[Python-Dev] Meta-reflections

Kevin Jacobs jacobs@penguin.theopalgroup.com
Thu, 21 Feb 2002 07:36:14 -0500 (EST)


On Thu, 21 Feb 2002, Moore, Paul wrote:
> > I agree, but that "official" support has clear limitations.
>
> I'm not sure what you mean.

When you request dir(object), there is a fairly significant amount of work
done.  Even though it is implemented in C, I can foresee a non-trivial
performance hit to a great deal of code.  vars(object) is better, though it
also has some performance implications.
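
A minimal sketch of the difference in work between the two calls, assuming a present-day Python 3 interpreter rather than the 2.2 under discussion. The Account class is made up for illustration, and timings depend on machine and interpreter, so no figures are claimed:

```python
import timeit

class Account(object):
    """Hypothetical example class, not from the original discussion."""
    def __init__(self):
        self.owner = "alice"
        self.balance = 100

acct = Account()

# vars() hands back the instance __dict__ itself; dir() walks the class
# hierarchy, merges every name it finds, and sorts the result.
instance_names = sorted(vars(acct))
all_names = dir(acct)

# Measure both calls; the numbers printed will differ from run to run.
t_vars = timeit.timeit(lambda: vars(acct), number=10000)
t_dir = timeit.timeit(lambda: dir(acct), number=10000)
print("vars: %.4fs  dir: %.4fs" % (t_vars, t_dir))
```

Here instance_names holds only the two instance attributes, while all_names also carries every inherited method and class attribute.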

I know that we are talking about Python, and performance is not of paramount
importance.  But realize that my company produces _extremely_ large Python
applications used for financial and business tracking and forecasting.  We
are acutely aware of bottlenecks in critical paths such as object
serialization.  I'm just not looking forward to 25% slowdowns in pickling
(number pulled out of hat) and I'm sure the Zope guys aren't either...

> > Right now, there are several examples in the Python standard
> > library where we use obj.__dict__.keys() --  most significantly
> > in pickle and cPickle.
>
> But aren't we agreed that this is the source of a bug (that slots aren't
> picklable)?

That is not the bug, if for no other reason than that the standard library
is free to use implementation-specific knowledge.  Getting obj.__dict__ is a really
slick and efficient way to reflect on all normal instance variables.
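
To illustrate what "normal instance variables" covers and what it leaves out, here is a small sketch, assuming a present-day Python 3 interpreter. Plain and Slotted are hypothetical classes, not taken from pickle or cPickle:

```python
class Plain(object):
    """Ordinary class: attributes live in the instance __dict__."""
    def __init__(self):
        self.x = 1
        self.y = 2

class Slotted(object):
    """Same attributes, but stored in slots instead of a dict."""
    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2

# Reflection through __dict__ sees every attribute of the plain instance...
plain_names = sorted(Plain().__dict__.keys())

# ...but a purely slotted instance has no __dict__ to reflect on at all.
slotted_has_dict = hasattr(Slotted(), "__dict__")
```

Code that relies on obj.__dict__.keys() therefore works for Plain and has nothing to iterate over for Slotted.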

> > Naive, maybe, but saying "undocumented" is equivalent to "unsupported
> > implementation detail" saves us from having to maintain backward
> > compatibility and following the official Python deprecation process.
>
> Equally, saying "it's not in the manual, so tough luck" is unreasonably
> harsh. We need to be reasonable about this.

I don't mean to imply that we need to be harsh, though in some cases we do
not want to worry about backward compatibility.  How do we tell users which
features are safe to use, so that they don't write thousands of lines of
code that suddenly break when the next Python version is released?  Well,
not documenting it in the official Python reference manual is a pretty good
way.  Personally, I'm extremely wary of using anything that isn't.  Such
features can also be documented in the reference manual and explicitly
marked as "subject to change", but that is not the case we are currently
dealing with.

> > Yes.  Dict-based attributes always have values, while
> > slot-based attributes can be unset and raise AttributeErrors
> > when trying to access them.
>
> Hmm. I could argue this a couple of ways. Slots should contain None when
> unassigned (no AttributeErrors), or code should be (re-)written to cope with
> AttributeError from things in dir(). I wouldn't argue it as "slots can raise
> AttributeError, so we need to treat slots and dict-based attributes
> separately, in 2 passes".

I don't see how filling slots with default values is compatible with the
premise that we want slots to act as close to normal instance attributes as
possible.  I've implemented quite a few things with slots.  In fact, I have
an experimental branch of a 200k LOC project that re-implements many low-level
components using slots.  The speed-ups and memory savings from doing so are
very, very compelling.  There are cases where I declare slots that may never
be used during any particular instance's lifetime.  I do expect them to raise
an AttributeError, just like they would have before they were slots.  If I
fill the slot, assigning None to it carries a very different semantic
meaning than an AttributeError.
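
The distinction between an unset slot and a slot holding None can be sketched as follows, assuming a present-day Python 3 interpreter. Node and is_set are hypothetical names used only for this example:

```python
class Node(object):
    """Hypothetical class with one slot that is set and one that is not."""
    __slots__ = ("parent", "cache")

def is_set(obj, name):
    """Return True if the attribute exists, even when its value is None."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True

n = Node()
n.parent = None   # explicit statement: "this node has no parent"
                  # n.cache is never assigned, so the slot stays empty

parent_set = is_set(n, "parent")   # the slot is filled, with None
cache_set = is_set(n, "cache")     # the slot is empty and raises on access
```

If unset slots silently returned None, the two cases above could no longer be told apart.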

Another good example is pickling.  Why would you ever want to pickle empty
slots?  They have no value, not even a default one, so why waste the
processor cycles or the disk space?
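
A rough sketch of serializing only the slots that actually hold a value, assuming a present-day Python 3 interpreter. Record, dump_filled_slots and load_filled_slots are hypothetical helpers, not how pickle or cPickle work, and for brevity they only look at the slots declared on the instance's own class:

```python
import pickle

class Record(object):
    """Hypothetical slotted class; 'note' is rarely filled in."""
    __slots__ = ("key", "value", "note")

    def __init__(self, key):
        self.key = key

def dump_filled_slots(obj):
    """Pickle a dict holding only the slots that currently have a value."""
    state = {}
    for name in obj.__slots__:
        try:
            state[name] = getattr(obj, name)
        except AttributeError:
            pass  # empty slot: nothing to store, nothing to write
    return pickle.dumps(state)

def load_filled_slots(cls, payload):
    """Rebuild an instance, leaving slots absent from the payload unset."""
    obj = cls.__new__(cls)
    for name, value in pickle.loads(payload).items():
        setattr(obj, name, value)
    return obj

rec = Record("a")
rec.value = 42
restored = load_filled_slots(Record, dump_filled_slots(rec))
```

The restored object has key and value, while note is still unset and still raises AttributeError, exactly as on the original.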

> Why not just change the line stuff = object.__dict__ to
>
>     stuff = [a for a in dir(object) if hasattr(object,a) and not callable(getattr(object,a))]

Um, because it's wrong?  I pickle _lots_ of callable objects.  It also
pickles class-attributes as instance-attributes.  Here is a better version
that can be used once vars(object) has been fixed:

  stuff = dict([ (a,getattr(object,a)) for a in vars(object) if hasattr(object,a)])

Note that it does an unnecessary getattr, hasattr, memory allocation and
incurs loop overhead on every dict attribute, but otherwise it should work
once vars is fixed.
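
Until vars() covers slots, the intended behaviour can be approximated by hand. The following is only a sketch, assuming a present-day Python 3 interpreter. instance_attrs is a hypothetical helper, it ignores name mangling of private slot names, and the Base/Derived/Mixed classes exist purely for the demonstration:

```python
def instance_attrs(obj):
    """Hypothetical stand-in for a slots-aware vars(obj)."""
    attrs = {}
    # Collect slots declared anywhere in the class hierarchy.
    for klass in type(obj).__mro__:
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)   # a bare string declares a single slot
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue       # bookkeeping slots, not user attributes
            try:
                attrs[name] = getattr(obj, name)
            except AttributeError:
                pass           # unset slot: leave it out entirely
    # Add ordinary dict-based attributes, if the instance has a __dict__.
    attrs.update(getattr(obj, "__dict__", {}))
    return attrs

class Base(object):
    __slots__ = ("a",)

class Derived(Base):
    __slots__ = ("b", "c")

class Mixed(Derived):
    pass   # no __slots__ here, so instances also get a __dict__

m = Mixed()
m.a = 1
m.b = len        # callables are kept like any other value
m.extra = "x"    # lands in the instance __dict__
result = instance_attrs(m)
```

Class attributes are not picked up, callables are not filtered out, and the unset slot c does not appear in the result.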

> 1. Make unbound slots return None rather than AttributeError

I am strongly against this.  It doesn't make sense to start supplying
implicit default values.  Explicit is better than implicit...

> 2. Make vars() return slots as well as dict-based attributes

Agree.

> 3. Document __dict__ as legacy usage, not slots-aware

Agree, though __dict__ should still be a valid way of accessing all non-slot
instance attributes.  Too much legacy code would break if this were not so.

> 4. Fix bugs caused by use of the legacy __dict__ approach

I'd rephrase that as: fix reflection code that assumes attributes only live
in __dict__.

> 5. Educate users in the new approaches which are slots-aware
>    (dir/vars, calling base class setattr, etc)

Calling base class setattr?  I'm not sure what you mean.

> >   2) Flat slot descriptions:  object.__slots__ being an
> >      immutable flat tuple of all slot names (including
> >      inherited ones), as opposed to being a potentially
> >      mutable sequence of only the slots defined by the most
> >      derived class.
>
> This I disagree with. I think __slots__ should be immutable. But I'm happy
> with "don't do that" as the way of implementing immutability, if that's what
> Guido prefers.

Not knowing what Guido is planning, all I can say is that he has made
__bases__ and __mro__ explicitly immutable.  If we now intend to make
__slots__ immutable as well, then it is better to do so explicitly.

> I definitely don't think __slots__ should return something
> different when you read it than what you assigned to it in the first place
> (which is what including inherited slots does). But I don't really think
> people have any business reading __slots__ in any case (see my arguments
> above).

By your logic, people don't have any business reading __dict__, but they do.
Imagine what would happen if we didn't expose __dict__ in Python 2.3.
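
For concreteness, the two readings of __slots__ debated above can be compared in a small sketch, assuming a present-day Python 3 interpreter. flat_slots is a hypothetical helper showing what a flat, inherited view might contain; it is not an existing API:

```python
def flat_slots(cls):
    """Return all slot names for cls, base classes first, as a tuple."""
    names = []
    for klass in reversed(cls.__mro__):
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)   # a bare string declares a single slot
        names.extend(slots)
    return tuple(names)

class Base(object):
    __slots__ = ("a",)

class Derived(Base):
    __slots__ = ("b", "c")

own = Derived.__slots__      # only what Derived itself declared
flat = flat_slots(Derived)   # inherited slots included
```

Reading Derived.__slots__ gives back exactly what was assigned, while the flat view is what reflection code needs in order to find every slot on an instance.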

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com