RE: [Python-Dev] Meta-reflections
From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
Having a flat namespace (i.e., no hidden slots), and having all 'reachable' slots in a single list are really two separate issues. Right now, we have this situation:
>>> class Foo(object): __slots__ = ('a','b')
>>> class Bar(Foo): __slots__ = ('c','d')
>>> bar = Bar()
>>> print bar.__slots__
('c', 'd')
>>> print bar.__class__.__base__.__slots__
('a', 'b')
I contend that bar.__slots__ should be:
('a', 'b', 'c', 'd')
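For illustration only, the flat listing argued for here can be approximated today by walking the MRO. This is a minimal sketch in modern Python 3 syntax; all_slots is a hypothetical helper name, not an existing API, and it assumes each class declares __slots__ as a string or a sequence of strings.

```python
def all_slots(cls):
    """Collect slot names from every class in the MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        # Look only at what this class itself declared, not what it inherits.
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name not in names:
                names.append(name)
    return tuple(names)


class Foo(object):
    __slots__ = ("a", "b")


class Bar(Foo):
    __slots__ = ("c", "d")


print(Bar.__slots__)    # only the slots Bar itself declares
print(all_slots(Bar))   # inherited slots first, then Bar's own
```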
I think somewhere along the line I may have mixed up which 'flatness' I was talking about.
Um. Be aware that I'm not 100% sure about the "no hidden slots" point. I only support it on the basis of acting the same as attributes. But I'm not sure about it for attributes, either... (although I doubt it will ever change). As for your contention over __slots__, I don't agree. I don't have a strong view, but I feel that __slots__ is really only meant as a way of *defining* the slots (and as such, may be replaced later by better syntax). Think of it as write-only. Modifying it after the class is defined, or reading it, aren't really well defined (and don't need to be). If slot definition had been spelt "slot a" instead of "__slots__ = ['a']", you wouldn't necessarily expect to have a readable attribute containing the list of slots...
The official, and supported, way is to use dir().
I agree, but that "official" support has clear limitations.
I'm not sure what you mean.
Right now, there are several examples in the Python standard library where we use obj.__dict__.keys() -- most significantly in pickle and cPickle.
But aren't we agreed that this is the source of a bug (that slots aren't picklable)?
There is also the vars(obj), which may be the better reflection method, though it currently doesn't know about slots.
Possibly you're right. This could easily be raised as a feature request. And possibly even as a bug (vars() should know about slots).
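A short sketch of the gap being described, written against current CPython rather than the 2.2 interpreter under discussion (an assumption about which version the reader runs). The class names Slotted and Mixed are made up for the example.

```python
class Slotted(object):
    __slots__ = ("a", "b")


class Mixed(object):
    # Declaring "__dict__" as a slot gives instances a dict as well.
    __slots__ = ("a", "__dict__")


s = Slotted()
s.a = 1
try:
    vars(s)                 # no __dict__ at all on this instance
    vars_error = None
except TypeError as exc:
    vars_error = exc

m = Mixed()
m.a = 1                     # stored in the slot
m.extra = 2                 # stored in __dict__
mixed_vars = vars(m)        # reports only the dict-based attribute

print(vars_error)
print(mixed_vars)
print("a" in dir(s))        # dir() does list the slot name
```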
Naive, maybe, but saying "undocumented" is equivalent to "unsupported implementation detail" saves us from having to maintain backward compatibility and following the official Python deprecation process.
Equally, saying "it's not in the manual, so tough luck" is unreasonably harsh. We need to be reasonable about this.
Do you have any reason why you would need to get a list of only slots, or only dict-based attributes?
Yes. Dict-based attributes always have values, while slot-based attributes can be unset and raise AttributeErrors when trying to access them.
Hmm. I could argue this a couple of ways. Slots should contain None when unassigned (no AttributeErrors), or code should be (re-)written to cope with AttributeError from things in dir(). I wouldn't argue it as "slots can raise AttributeError, so we need to treat slots and dict-based attributes separately, in 2 passes".
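To make the behaviour under discussion concrete, here is a small sketch in Python 3 syntax showing a declared but unassigned slot. Point and slot_is_set are hypothetical names introduced only for this example; the second option above (coping with AttributeError) is what slot_is_set does.

```python
class Point(object):
    __slots__ = ("x", "y")


def slot_is_set(obj, name):
    """Return True if the attribute currently has a value, False if unset."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True


p = Point()
p.x = 1                      # y is declared but never assigned

print("y" in dir(p))         # the name is listed by dir()
print(slot_is_set(p, "x"))
print(slot_is_set(p, "y"))   # reading p.y would raise AttributeError
```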
here is how I would handle pickling (excerpt from pickle.py):
    try:
        getstate = object.__getstate__
    except AttributeError:
        stuff = object.__dict__
        # added to support slots
        if hasattr(object, '__slots__'):
            for slot in object.__slots__:
                # skip slots that have never been assigned
                if hasattr(object, slot):
                    stuff[slot] = getattr(object, slot)
    else:
        stuff = getstate()
        _keep_alive(stuff, memo)
    save(stuff)
    write(BUILD)
Why not just change the line

    stuff = object.__dict__

to

    stuff = [a for a in dir(object) if hasattr(object,a) and not callable(getattr(object,a))]

[The hasattr() gets rid of unbound slots - this may be another argument for unbound slots containing None, and the callable() gets rid of methods]. Then slots are covered, as well as any future non-dict-based attribute types. If vars(obj) was fixed to include slots, like dir() was, then this could be reduced to "stuff = vars(object)" (modulo protection against AttributeError).
I'm not suggesting anything more incestuous and low-level than what is already done in many, many, many places. A larger, more-encompassing proposal is definitely welcome.
I'm not sure we need a larger proposal. A smaller one may work better. I'm arguing above for
1. Make unbound slots return None rather than AttributeError
2. Make vars() return slots as well as dict-based attributes
3. Document __dict__ as legacy usage, not slots-aware
4. Fix bugs caused by use of the legacy __dict__ approach
5. Educate users in the new approaches which are slots-aware (dir/vars, calling base class setattr, etc)
(and maybe a sixth, don't make __slots__ a reflection API - make it an implementation detail, a bit like __dict__ is now viewed)
Well, I've not found resounding agreement on the first two of my three basic issues/bugs I've raised so far:
1) Flat slot namespaces: Objects should not hide slots when inherited by classes implementing the same slot name.
You're right - I'm not in "resounding" agreement. I think it's probably better, for consistency with dict-based attributes, but I sort of wish it wasn't. (The fact that I've not hit problems because of the equivalent property of attributes means that I'm probably wrong, and attributes are fine as they are, though.)
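A sketch of what "hidden slots" means in practice, using Python 3 syntax and made-up class names. It relies on the documented behaviour that a slot redefined in a subclass leaves the base class slot reachable only through the base class descriptor.

```python
class Base(object):
    __slots__ = ("a",)


class Derived(Base):
    __slots__ = ("a",)        # same name: a second, separate slot


d = Derived()
d.a = "derived value"                 # goes to Derived's slot
Base.a.__set__(d, "base value")       # reach the hidden slot via Base's descriptor

print(d.a)                            # value in Derived's slot
print(Base.a.__get__(d, Derived))     # value in Base's hidden slot
```

Both values live in the same instance, which is the non-flat namespace being objected to.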
2) Flat slot descriptions: object.__slots__ being an immutable flat tuple of all slot names (including inherited ones), as opposed to being a potentially mutable sequence of only the slots defined by the most derived class.
This I disagree with. I think __slots__ should be immutable. But I'm happy with "don't do that" as the way of implementing immutability, if that's what Guido prefers. I definitely don't think __slots__ should return something different when you read it than what you assigned to it in the first place (which is what including inherited slots does). But I don't really think people have any business reading __slots__ in any case (see my arguments above).
3) First class status for slot reflection: making slots picklable by default, returned by vars(object), and made part of other relevant reflection APIs and standard implementations.
I agree on this one. Paul.
On Thu, 21 Feb 2002, Moore, Paul wrote:
I agree, but that "official" support has clear limitations.
I'm not sure what you mean.
When you request dir(object), there is a fairly significant amount of work done. Even though it is implemented in C, I can foresee a non-trivial performance hit to a great deal of code. vars(object) is better, though it also has some performance implications. I know that we are talking about Python, and performance is not of paramount importance. But realize that my company produces _extremely_ large Python applications used for financial and business tracking and forecasting. We are acutely aware of bottle-necks in critical paths such as object serialization. I'm just not looking forward to 25% slowdowns in pickling (number pulled out of hat) and I'm sure the Zope guys aren't either...
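Rather than rely on a number pulled out of a hat, the relative cost can be measured. The following is only a rough sketch using timeit on a made-up Account class; the figures depend entirely on the interpreter, the machine, and the shape of the objects, so no expected output is shown.

```python
import timeit


class Account(object):
    """Hypothetical stand-in for an application object with a few attributes."""

    def __init__(self):
        self.owner = "someone"
        self.balance = 0
        self.history = []


acct = Account()
n = 20000  # repetitions per measurement

dict_time = timeit.timeit(lambda: list(acct.__dict__.keys()), number=n)
vars_time = timeit.timeit(lambda: list(vars(acct).keys()), number=n)
dir_time = timeit.timeit(lambda: dir(acct), number=n)

print("__dict__.keys(): %.4fs" % dict_time)
print("vars():          %.4fs" % vars_time)
print("dir():           %.4fs" % dir_time)
```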
Right now, there are several examples in the Python standard library where we use obj.__dict__.keys() -- most significantly in pickle and cPickle.
But aren't we agreed that this is the source of a bug (that slots aren't picklable)?
That is not the bug -- if for no other reason, the standard library is free to use implementation specific knowledge. Getting obj.__dict__ is a really slick and efficient way to reflect on all normal instance variables.
Naive, maybe, but saying "undocumented" is equivalent to "unsupported implementation detail" saves us from having to maintain backward compatibility and following the official Python deprecation process.
Equally, saying "it's not in the manual, so tough luck" is unreasonably harsh. We need to be reasonable about this.
I don't mean to imply that we need to be harsh, though in some cases we do not want to worry about backward compatibility. How do we tell users which features are safe to use, so that they don't write thousands of lines of code that suddenly break when the next Python version is released? Well, not documenting it in the official Python reference manual is a pretty good way. Personally, I'm extremely wary of using anything that isn't. Such features can also be documented in the reference manual and explicitly marked as "subject to change", but that is not the case we are currently dealing with.
Yes. Dict-based attributes always have values, while slot-based attributes can be unset and raise AttributeErrors when trying to access them.
Hmm. I could argue this a couple of ways. Slots should contain None when unassigned (no AttributeErrors), or code should be (re-)written to cope with AttributeError from things in dir(). I wouldn't argue it as "slots can raise AttributeError, so we need to treat slots and dict-based attributes separately, in 2 passes".
I don't see how filling slots with default values is compatible with the premise that we want slots to act as close to normal instance attributes as possible. I've implemented quite a few things with slots. In fact, I have an experimental branch of a 200k LOC project that re-implements many low level components using slots. The speed-ups and memory savings from doing so are very, very compelling. There are cases where I declare slots that may never be used during any particular instance lifetime. I do expect them to raise an AttributeError, just like they would have before they were slots. If I fill the slot, assigning it to None has another very different semantic meaning than an AttributeError. Another good example is pickling. Why would you ever want to pickle empty slots? They have no value, not even a default one, so why waste the processor cycles or the disk space?
Why not just change the line stuff = object.__dict__ to
stuff = [a for a in dir(object) if hasattr(object,a) and not callable(getattr(object,a))]
Um, because it's wrong? I pickle _lots_ of callable objects. It also pickles class-attributes as instance-attributes. Here is a better version that can be used once vars(object) has been fixed:

    stuff = dict([ (a,getattr(object,a)) for a in vars(object) if hasattr(object,a)])

Note that it does an unnecessary getattr, hasattr, memory allocation and incurs loop overhead on every dict attribute, but otherwise it should work once vars is fixed.
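The two objections can be checked with a small sketch in Python 3 syntax. Base, Record, dir_based_names and instance_state are hypothetical names for this example; instance_state stands in for what a slots-aware vars() might return and is not how pickle itself gathers state.

```python
class Base(object):
    __slots__ = ("a",)
    kind = "base"            # class attribute, not instance state


class Record(Base):
    __slots__ = ("b", "callback", "__dict__")


def dir_based_names(obj):
    """The dir()-based filter: names that are set and not callable."""
    return [a for a in dir(obj)
            if hasattr(obj, a) and not callable(getattr(obj, a))]


def instance_state(obj):
    """Instance state only: __dict__ entries plus slots that are currently set."""
    state = dict(getattr(obj, "__dict__", {}))
    for klass in type(obj).__mro__:
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue                 # bookkeeping slots, not data
            if hasattr(obj, name):       # skip slots never assigned
                state[name] = getattr(obj, name)
    return state


r = Record()
r.a = 1
r.callback = len     # a callable stored as instance data
r.extra = "x"        # lands in __dict__
# r.b is deliberately left unset

names = dir_based_names(r)
print("kind" in names)        # class attribute is picked up
print("callback" in names)    # callable instance data is dropped
print(instance_state(r))
```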
1. Make unbound slots return None rather than AttributeError
I am strongly against this. It doesn't make sense to start supplying implicit default values. Explicit is better than implicit...
2. Make vars() return slots as well as dict-based attributes
Agree.
3. Document __dict__ as legacy usage, not slots-aware
Agree, though __dict__ should still be a valid way of accessing all non-slot instance attributes. Too much legacy code would break if this were not so.
4. Fix bugs caused by use of the legacy __dict__ approach
I'd rephrase that as: fix reflection code that assumes attributes only live in __dict__.
5. Educate users in the new approaches which are slots-aware (dir/vars, calling base class setattr, etc)
Calling base class setattr? I'm not sure what you mean?
2) Flat slot descriptions: object.__slots__ being an immutable flat tuple of all slot names (including inherited ones), as opposed to being a potentially mutable sequence of only the slots defined by the most derived class.
This I disagree with. I think __slots__ should be immutable. But I'm happy with "don't do that" as the way of implementing immutability, if that's what Guido prefers.
Not knowing what Guido is planning, all I can say is that he has made __bases__ and __mro__ explicitly immutable. If we now intend to make __slots__ immutable as well, then it is better to do so explicitly.
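A sketch of the difference between the two kinds of immutability, run against current CPython (an assumption; behaviour in 2.2 may have differed). Rebinding __slots__ after class creation is accepted but has no effect on the instance layout, whereas __mro__ rejects assignment outright.

```python
class Foo(object):
    __slots__ = ("a", "b")


# Accepted, but it only rebinds the class attribute; no new slot is created.
Foo.__slots__ = ("a", "b", "z")

f = Foo()
try:
    f.z = 1
    z_assignable = True
except AttributeError:
    z_assignable = False

# __mro__ is explicitly read-only.
try:
    Foo.__mro__ = (Foo, object)
    mro_assignable = True
except AttributeError:
    mro_assignable = False

print(Foo.__slots__)
print(z_assignable)
print(mro_assignable)
```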
I definitely don't think __slots__ should return something different when you read it than what you assigned to it in the first place (which is what including inherited slots does). But I don't really think people have any business reading __slots__ in any case (see my arguments above).
By your logic, people don't have any business reading __dict__, but they do. Imagine what would happen if we didn't expose __dict__ in Python 2.3?

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19
E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714
WWW: http://www.theopalgroup.com