[Python-Dev] Bug #537450 Improper object initialization for type.__new__

Kevin Jacobs jacobs@penguin.theopalgroup.com
Sun, 6 Oct 2002 13:13:30 -0400 (EDT)


Hi Guido and python-dev,

Python 2.2.2 beta 1 is coming out, and I was hoping to float a final request
to have the fix for bug #537450 applied to the 2.2 maint branch (it is a
trivial backport, since the 2.3 patch applies just fine).  A back-port was
originally passed over, because it changes semantics from Python 2.2.1.  I
argue that the changed semantics are desirable and constitute a clear bug
fix, which closes holes that can allow very obscure errors.  It would be
very surprising to me if this change breaks any extant code, since it is
virtually impossible to take meaningful or safe advantage of the original
semantics.

Here is, in brief, an example of what I am talking about (the bug and
python-dev archives have more information on this topic).  Let's define a
class Foo:

class Foo(object):
  def __new__(cls):
    return [1,2,3]

print Foo()

Okay, this _is_ a contrived example, though it does reflect a real use case
that appears in one of my large real-world apps.  Here is the output from
Python 2.2.1:

  >>> print Foo()
  []         # !!! not what we expected!

and from Python 2.3 CVS:

  >>> print Foo()
  [1, 2, 3]  # Just right!

This is because, in 2.2.1, any object returned from __new__ has its
constructor (__init__) called, regardless of its type.  Here list.__init__
is called with no arguments, which resets the list to empty.  The new rule
in Python 2.3 CVS is that the constructor is only called when the object
returned is an instance of the type being built (including subtype
instances).
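
The two rules can be sketched in pure Python.  This is only an emulation
under stated assumptions: the helper names construct_old_rule and
construct_new_rule are hypothetical and not part of Python, and the real
logic is implemented in C inside the interpreter.  The sketch is written so
that it also runs on current Python versions.

```python
def construct_old_rule(cls, *args, **kwargs):
    """Emulate the 2.2.1 rule (hypothetical helper): always call __init__
    on whatever object __new__ returned, regardless of its type."""
    obj = cls.__new__(cls, *args, **kwargs)
    obj.__init__(*args, **kwargs)
    return obj


def construct_new_rule(cls, *args, **kwargs):
    """Emulate the 2.3 rule (hypothetical helper): call __init__ only when
    __new__ returned an instance of cls (subtype instances included)."""
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj, cls):
        obj.__init__(*args, **kwargs)
    return obj


class Foo(object):
    def __new__(cls):
        return [1, 2, 3]


# Old rule: list.__init__() runs with no arguments and empties the list.
old_result = construct_old_rule(Foo)

# New rule: a list is not a Foo instance, so __init__ is skipped.
new_result = construct_new_rule(Foo)
```

With these helpers, old_result ends up as an empty list, while new_result
keeps the three elements that __new__ returned.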

The above example demonstrates the problems with the 2.2.1 behavior on lists,
though more problems are exposed when we consider other object types:

class Bar(object):
  def __init__(self, a, b, c):
    pass

class Foo(object):
  def __new__(cls,a):
    if a:
      return Bar(1,2,3)
    else:
      return super(Foo, cls).__new__(cls)

Here is what happens in Python 2.3 CVS:

  >>> print Foo(0)
  <__main__.Foo object at 0x402baf0c>
  >>> print Foo(1)
  <__main__.Bar object at 0x402bab6c>

All looks as it should, based on the last discussion of bug #537450.

Here is what happens in Python 2.2.1 (and the current 2.2.2 CVS):

  >>> print Foo(0)
  <__main__.Foo object at 0x8174434>
  >>> print Foo(1)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: __init__() takes exactly 4 arguments (2 given)


In the first case, where a=0, the superclass __new__ is called, and a Foo
instance is returned.  All is fine, so far.

However, the second case returns an initialized Bar instance, which Foo
tries to initialize again by calling Bar.__init__ with the same arguments
that were passed into Foo.__new__.  An arity error results, which is
actually a good thing, since arbitrarily calling another object's
constructor with arguments passed in for another object is a decidedly bad
thing to do in general.  The more severe problem occurs when the arguments
do conform, and an instance of an improperly constructed object is returned.
Either way, Bar.__init__ is invoked _twice_: once with the intended
arguments, and once more with arguments that were intended for Foo!
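
The silent case, where the arguments happen to conform, can be illustrated
with a small sketch.  The one-argument Bar, the init_calls counter, and the
helper call_with_old_rule are all hypothetical and introduced only for this
illustration; the helper emulates the 2.2.1 rule rather than reproducing the
interpreter's actual code.

```python
class Bar(object):
    def __init__(self, value):
        self.value = value
        # Count how many times this instance has been initialized.
        self.init_calls = getattr(self, "init_calls", 0) + 1


class Foo(object):
    def __new__(cls, flag):
        # Return a fully initialized Bar instead of a Foo instance.
        return Bar("intended for Bar")


def call_with_old_rule(cls, *args):
    """Emulate the 2.2.1 rule (hypothetical helper): __init__ is always
    called on the object returned by __new__, with the caller's arguments."""
    obj = cls.__new__(cls, *args)
    obj.__init__(*args)
    return obj


bar = call_with_old_rule(Foo, "intended for Foo")
# No exception is raised, yet the Bar instance has been initialized twice
# and now carries the argument that was meant for Foo.
```

Here bar.value ends up as "intended for Foo" and bar.init_calls is 2, with
no error to signal that anything went wrong.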

I assert that no sensible developer would ever write code that relied on the
pre-2.3 CVS semantics.  I've done a survey of much of the meta-class code
posted to or discussed on Python-dev and comp.lang.python, and none rely on
this behavior.  Plus, even if they did, we have already decided they will
have to change it when 2.3 comes out.

If you are not entirely convinced by the above argument, please consider a
compromise solution: put the fix in for 2.2.2 Beta 1.  If anyone has
complaints based on extant code, then we can always back it out for Beta 2,
or the final release.

Thanks for taking the time...

-Kevin Jacobs

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com