RE: [Python-Dev] Meta-reflections
From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
On Thu, 21 Feb 2002, Moore, Paul wrote:
I agree, but that "official" support has clear limitations.
I'm not sure what you mean.
When you request dir(object), there is a fairly significant amount of work done. [...] I know that we are talking about Python, and performance is not of paramount importance.
Hmm. I tend to favour "do it right, then do it fast". If there's a performance hit on dir(), why can't it be made faster? If nothing else, as a part of the core, dir() has the right to access __dict__ and __slots__. So there's no a priori reason why dir() should be slower than *any* user-coded way of doing the same. Of course, we *really* want vars() here, as we're otherwise doing work in dir() to get entries that we then throw away. But that's the only issue. Get vars() to work, and if it's too slow, you can argue that it's a bug because "I can get the same results by using the following code, which is faster".
I'm just not looking forward to 25% slowdowns in pickling (number pulled out of hat) and I'm sure the Zope guys aren't either...
There's bound to be some slowdown, from the (new) need to find slots as well as dict-based attributes. I'm happy if you want it minimised. But that's a new point you've raised, which I don't have the expertise to comment on.
That is not the bug -- if for no other reason, the standard library is free to use implementation specific knowledge. Getting obj.__dict__ is a really slick and efficient way to reflect on all normal instance variables.
I'm not sure I agree here - it's better if the standard library uses interfaces which are available to the user. And if pickling can be made fast, why shouldn't the machinery that makes this possible be made available to the end user? You could say that this argues in favour of making __dict__ and __slots__ part of the "official" reflection API. My view is that it argues for making the "official" API (which I'm assuming will be vars() for now) efficient enough that people don't need to use __dict__ and __slots__. Encapsulation is good.
I don't see how filling slots with default values is compatible with the premise that we want slots to act as close to normal instance attributes as possible.
Fair enough. I offered that as one option. Clearly you prefer the other (that what's in dir() and/or __slots__ cannot be guaranteed not to raise AttributeError). I'm happy either way, not having a vested interest in the issue.
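[Editorial sketch, not part of the original message.] The second option can be illustrated on a current Python 3 interpreter (not the 2.2 under discussion): a slot that is declared but never assigned raises AttributeError, just as a missing instance attribute would. The class name Point and the helper slot_is_set are hypothetical names chosen for the illustration.

```python
class Point:
    __slots__ = ("x", "y")


def slot_is_set(obj, name):
    """Return True if reading the attribute succeeds, False on AttributeError."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True


p = Point()
p.x = 1  # y is declared as a slot but never assigned

# Only the slots that actually hold a value are reported.
set_slots = [name for name in Point.__slots__ if slot_is_set(p, name)]
print(set_slots)  # ['x']
```

Any reflection code that walks __slots__ therefore has to tolerate AttributeError on individual names.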
Why not just change the line stuff = object.__dict__ to
stuff = [a for a in dir(object) if hasattr(object,a) and not callable(getattr(object,a))]
Um, because it's wrong?
Sorry - it was an off-the-top-of-the-head suggestion. But it made my real point, which was that you can do it with dir().
stuff = dict([ (a,getattr(object,a)) for a in vars(object) if hasattr(object,a)])
Note that it does an unnecessary getattr, hasattr, memory allocation and incurs loop overhead on every dict attribute, but otherwise it should work once vars is fixed.
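[Editorial sketch, not part of the original message.] One way to see why the dir()-based filter is not a drop-in replacement for reading __dict__ is to compare the two on a small class. The sketch below assumes current Python 3; the class Record and its attributes are hypothetical.

```python
class Record:
    kind = "record"          # class attribute, not instance state

    def __init__(self):
        self.name = "spam"
        self.callback = len  # a callable stored as instance state


r = Record()

# The dir()-based filter from the thread, with `object` renamed to `r`.
via_dir = [a for a in dir(r) if hasattr(r, a) and not callable(getattr(r, a))]

# Direct read of the instance dictionary.
via_dict = sorted(r.__dict__)

print(via_dict)               # ['callback', 'name']
print("callback" in via_dir)  # False: callable instance state is dropped
print("kind" in via_dir)      # True: class attributes leak in
```

The filter both loses real instance state and picks up names that do not belong to the instance, which is the kind of corner case a pickler cannot tolerate.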
Efficiency again. I'd have to bow to your greater experience here. Although with pickling, doesn't I/O usually outweigh any performance cost?
3. Document __dict__ as legacy usage, not slots-aware
Agree, though __dict__ should still be a valid way of accessing all non-slot instance attributes. Too much legacy code would break if this were not so.
That's what I meant. Document it as the historical way of getting at instance attributes. Still available, but code which uses it will not support slots. After all, if you pass classes using slots into code which uses __dict__, things will go wrong. That's just another sort of breakage. Nobody's arguing that __dict__ should go away. Except possibly from the documentation :-)
Calling base class setattr? I'm not sure what you mean?
It's in the part of the descrintro document I pointed you at. Traditional implementations of setattr used assignment to self.__dict__['attr'] to avoid infinite recursion. The "new way" discussed in the descrintro document is to call the base class setattr.
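[Editorial sketch, not part of the original message.] The difference between the two idioms shows up as soon as a class uses slots. The sketch below assumes current Python 3; the class names Logged and LegacyLogged are hypothetical.

```python
class Logged:
    """Base-class call: works whether storage is a __dict__ or slots."""
    __slots__ = ("value", "log")

    def __init__(self):
        # Bypass our own __setattr__ while the log does not exist yet.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(name)
        object.__setattr__(self, name, value)


class LegacyLogged:
    """Traditional idiom: assumes every instance has a __dict__."""
    __slots__ = ("value",)

    def __setattr__(self, name, value):
        self.__dict__[name] = value  # slotted instances have no __dict__


obj = Logged()
obj.value = 42
print(obj.value, obj.log)  # 42 ['value']

try:
    LegacyLogged().value = 42
    legacy_failed = False
except AttributeError:
    legacy_failed = True
print(legacy_failed)  # True
```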
By your logic, people don't have any business reading __dict__, but they do.
They don't have any business *any more*. An important distinction. (And it's not anywhere near as black and white as that comment implies - I know that).
Imagine what would happen if we didn't expose __dict__ in Python 2.3?
Nothing at all, if we provide an alternative. Except for backward compatibility issues, which there's a well-documented deprecation process to address. Of course, nobody is proposing the removal of __dict__. All I'm suggesting is that we document its limitations, point out better ways, and leave it at that. Paul.
On Thu, 21 Feb 2002, Moore, Paul wrote:
Hmm. I tend to favour "do it right, then do it fast". If there's a performance hit on dir(), why can't it be made faster? [snip] Of course, we *really* want vars() here, as we're otherwise doing work in dir() to get entries that we then throw away.
dir(object) simply doesn't do what we want. I've tried several times to write a correct pickler using dir(object) and have always run into problems due to pathological corner-cases. I encourage you to try your hand at it. In the process I've found another issue with the slots implementation. I'll post the details to python-dev in a separate e-mail.
Note that it does an unnecessary getattr, hasattr, memory allocation and incurs loop overhead on every dict attribute, but otherwise it should work once vars is fixed.
Efficiency again. I'd have to bow to your greater experience here. Although with pickling, doesn't I/O usually outweigh any performance cost?
I can't speak for everyone's applications, but we frequently pickle to memory, or to the operating system buffer-cache, where the data doesn't live long enough to hit the disk.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19   E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714          WWW: http://www.theopalgroup.com
[Kevin Jacobs]
In the process I've found another issue with the slots implementation. I'll post the details to python-dev in a separate e-mail.
FYI bugs reported only on python-dev have a high probability of getting lost in the vacuum (Tim often warns against that).
Now a seeming bug is a seeming bug, so I have reported your bug to SF:
http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470
In general don't expect that someone will post bugs on your behalf.

regards, Samuele Pedroni.
On Thu, 21 Feb 2002, Samuele Pedroni wrote:
[Kevin Jacobs]
In the process I've found another issue with the slots implementation. I'll post the details to python-dev in a separate e-mail.
FYI bugs reported only on python-dev have a high probability of getting lost in the vacuum (Tim often warns against that).
Now a seeming bug is a seeming bug, so I have reported your bug to SF:
http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470
In general don't expect that someone will post bugs on your behalf.
Thanks. I have a collection of about 8 more bugs that is expanding as I grow my test suite. Before I spray all of them onto SF, I want to hear from Guido, since some of my "bugs" are potentially subjective. I _have_ tried three times to post a summary-bug to SF and it's not worked (as usual). Is it just me or is SF flaky as hell? The last time I tried to post a bug, it kicked me out and was "Down for maintenance" for some time after that. Now it won't let me login since it thinks I haven't responded to the new account confirmation e-mail. Grrrrrrrrrr

-Kevin
[Kevin Jacobs]
... I have a collection of about 8 more bugs that is expanding as I grow my test suite. Before I spray all of them onto SF, I want to hear from Guido, since some of my "bugs" are potentially subjective.
The best way to hear from Guido is to post bugs, and suspected bugs, to SourceForge, one bug per report. There's so much verbiage about this now on Python-Dev that I doubt he'll ever be able to make time to catch up with it when he returns. A great advantage of a good bug report is that it's focused and brief. Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts.
I _have_ tried three times to post a summary-bug to SF and it's not worked (as usual). Is it just me or is SF flaky as hell? The last time I tried to post a bug, it kicked me out and was "Down for maintenance" for some time after that. Now it won't let me login since it thinks I haven't responded to the new account confirmation e-mail. Grrrrrrrrrr
It *sounds* like you're getting started with SF. Once it agrees not to hate you <wink>, life gets a lot easier. It's not flaky in general, but it does suffer bouts of extreme flakiness from time to time.
[Kevin Jacobs]
... I have a collection of about 8 more bugs that is expanding as I grow my test suite. Before I spray all of them onto SF, I want to hear from Guido, since some of my "bugs" are potentially subjective.
The best way to hear from Guido is to post bugs, and suspected bugs, to SourceForge, one bug per report. There's so much verbiage about this now on Python-Dev that I doubt he'll ever be able to make time to catch up with it when he returns. A great advantage of a good bug report is that it's focused and brief.
It's very true.
Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts.
I see, but it seems that the only way to coherently and transparently remove the warts implies that the __dict__ of a new-style class instance with slots should be tied to the instance and can no longer be a vanilla dict. Something only Guido can rule on.

some-more-verbiage-ly y'rs - Samuele.
FWIW, some of my Boost colleagues have been watching SF's future prospects with some suspicion. The financial outlook is worrisome; I submitted a support request in April 2001 that still hasn't been addressed (http://sourceforge.net/tracker/?func=detail&aid=414066&group_id=1&atid=350001). We're establishing all new services elsewhere, and even moving some old ones. For the long-term health of Python, you might want to make sure you're prepared to move quickly if necessary.

-Dave

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>
To: "'Python Dev'" <python-dev@python.org>
Sent: Thursday, February 21, 2002 4:06 PM
Subject: RE: [Python-Dev] Meta-reflections
[Kevin Jacobs]
... I have a collection of about 8 more bugs that is expanding as I grow my test suite. Before I spray all of them onto SF, I want to hear from Guido, since some of my "bugs" are potentially subjective.
The best way to hear from Guido is to post bugs, and suspected bugs, to SourceForge, one bug per report. There's so much verbiage about this now on Python-Dev that I doubt he'll ever be able to make time to catch up with it when he returns. A great advantage of a good bug report is that it's focused and brief.
Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts.
I _have_ tried three times to post a summary-bug to SF and it's not worked (as usual). Is it just me or is SF flaky as hell? The last time I tried to post a bug, it kicked me out and was "Down for maintenance" for some time after that. Now it won't let me login since it thinks I haven't responded to the new account confirmation e-mail. Grrrrrrrrrr
It *sounds* like you're getting started with SF. Once it agrees not to hate you <wink>, life gets a lot easier. It's not flaky in general, but it does suffer bouts of extreme flakiness from time to time.
[David Abrahams]
FWIW, some of my Boost colleagues have been watching SF's future prospects with some suspicion.
It's worth a lot, and we do too -- at least in fits, when somebody remembers it's something that's going to kill us someday.
The financial outlook is worrisome; I submitted a support request in April 2001 that still hasn't been addressed (<http://sourceforge.net/tracker/?func=detail&aid=414066&group_id=1&atid=350001>).
Well, that's really a feature request, and *nobody* responds well to witty oblique references to the Odyssey except me <wink>.
We're establishing all new services elsewhere, and even moving some old ones. For the long-term health of Python, you might want to make sure you're prepared to move quickly if necessary.
We supposedly have a cron job set up to suck down Python's CVS tarball every night (the people who would know if this is currently working are out this week). What I don't think we ever figured out how to do was capture the info in the trackers (bugs, patches, feature requests). That would be a major loss, as well as a chance to forget about 500 people who can't figure out how to use threads on HP-UX, so let's call it a wash <wink>.
[Tim]
Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts.
[Samuele Pedroni]
I see, but it seems that the only way to coherently and transparently remove the warts implies that the __dict__ of a new-style class instance with slots should be tied to the instance and can no longer be a vanilla dict. Something only Guido can rule on.
He'll be happy to <wink>. Optimizations aren't always wart-free, and then living with warts is a price paid for benefiting from the optimization. I'm sure Guido would consider it "a bug" if slots are ignored by the pickling mechanism, but wouldn't for an instant consider it "a bug" that the set of slots in effect when a class is created can't be dynamically expanded later (this latter is more a sensible restriction than a wart, IMO -- and likely in Guido's too).
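[Editorial sketch, not part of the original message.] Both halves of that distinction can be observed on a current Python 3 interpreter, where the default copy machinery does carry slot values and the slot set stays fixed. copy.copy is used here as a stand-in for pickle, on the grounds that both modules obtain an object's state through __reduce_ex__; the class name Pair is hypothetical.

```python
import copy


class Pair:
    __slots__ = ("left", "right")

    def __init__(self, left, right):
        self.left = left
        self.right = right


p = Pair(1, 2)

# Slot values are part of the state collected by the default __reduce_ex__.
q = copy.copy(p)
print(q.left, q.right)  # 1 2

# The set of slots is fixed when the class is created; an undeclared name
# has neither a slot nor a __dict__ to land in.
try:
    p.extra = 3
    accepted = True
except AttributeError:
    accepted = False
print(accepted)  # False
```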
What I don't think we ever figured out how to do was capture the info in the trackers (bugs, patches, feature requests). That would be a major loss, as well as a chance to forget about 500 people who can't figure out how to use threads on HP-UX, so let's call it a wash <wink>.
From: Tim Peters <tim.one@comcast.net>
[Tim]
Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts.
[Samuele Pedroni]
I see, but it seems that the only way to coherently and transparently remove the warts implies that the __dict__ of a new-style class instance with slots should be tied to the instance and can no longer be a vanilla dict. Something only Guido can rule on.
He'll be happy to <wink>. Optimizations aren't always wart-free, and then living with warts is a price paid for benefiting from the optimization. I'm sure Guido would consider it "a bug" if slots are ignored by the pickling mechanism, but wouldn't for an instant consider it "a bug" that the set of slots in effect when a class is created can't be dynamically expanded later (this latter is more a sensible restriction than a wart, IMO -- and likely in Guido's too).
I was thinking along the line of the C equiv of this:
[Yup the situation of a subclass of a class with slots is more relevant]

    class C(object):
        __slots__ = ['_a']

    class D(C):
        pass

    def allslots(cls):
        # walk the MRO from the most basic class down, so that a
        # subclass's descriptor replaces a base class's of the same name
        mro = list(cls.__mro__)
        mro.reverse()
        allslots = {}
        for c in mro:
            cdict = c.__dict__
            if '__slots__' in cdict:
                for slot in cdict['__slots__']:
                    allslots[slot] = cdict[slot]
        return allslots

    class slotdict(dict):
        __slots__ = ['_inst','_allslots']

        def __init__(self,inst,allslots):
            self._inst = inst
            self._allslots = allslots

        def __getitem__(self,k):
            if self._allslots.has_key(k):
                # self _allslots should be reachable as self._inst.__class__.__allslots__
                # AttributeError should become a KeyError ?
                return self._allslots[k].__get__(self._inst)
            else:
                return dict.__getitem__(self,k)

        def __setitem__(self,k,v):
            if self._allslots.has_key(k):
                # self _allslots should be reachable as self._inst.__class__.__allslots__
                # AttributeError should become a KeyError ?
                return self._allslots[k].__set__(self._inst,v)
            else:
                return dict.__setitem__(self,k,v)

        # other methods accordingly

    d=D()
    d.__dict__ = slotdict(d,allslots(D)) # should be so automagically
    # allslots(D) should be probably accessible as d.__class__.__allslots__
    # for transparency C.__dict__ should not contain any slot descr
    # __allslots__ should be readonly and disallow rebinding
    # d.__dict__ should disallow rebinding
    # c =C() ; c.__dict__ should return a proxy dict lazily or even more so ...

Lots of things to rule about and trade-offs to consider.

the-more-it's-arbitrary-the-more-you-need-_one_-ruler-ly y'rs - Samuele.
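[Editorial sketch, not part of the original message.] For readers on a current Python 3 interpreter, the same idea can be approximated without replacing __dict__ at all, by wrapping the instance in a mapping view. This is only a sketch under stated assumptions: the names all_slots and SlotView are hypothetical, the instance is assumed to also have a __dict__ (as D does above), and private double-underscore slot names, which Python mangles, are not handled.

```python
from collections.abc import MutableMapping


def all_slots(cls):
    """Map slot name -> descriptor, base classes first so subclasses win."""
    found = {}
    for klass in reversed(cls.__mro__):
        names = klass.__dict__.get("__slots__", ())
        if isinstance(names, str):  # a bare string declares a single slot
            names = (names,)
        for name in names:
            if name not in ("__dict__", "__weakref__"):
                found[name] = klass.__dict__[name]
    return found


class SlotView(MutableMapping):
    """Dict-like view over an instance's __dict__ plus its assigned slots."""

    def __init__(self, inst):
        self._inst = inst
        self._slots = all_slots(type(inst))
        self._dict = inst.__dict__  # assumption: the instance has one

    def __getitem__(self, key):
        if key in self._slots:
            try:
                return self._slots[key].__get__(self._inst, type(self._inst))
            except AttributeError:
                raise KeyError(key) from None  # unset slot acts like a missing key
        return self._dict[key]

    def __setitem__(self, key, value):
        if key in self._slots:
            self._slots[key].__set__(self._inst, value)
        else:
            self._dict[key] = value

    def __delitem__(self, key):
        if key in self._slots:
            try:
                self._slots[key].__delete__(self._inst)
            except AttributeError:
                raise KeyError(key) from None
        else:
            del self._dict[key]

    def __iter__(self):
        for key in self._slots:
            if key in self:  # skips slots that were never assigned
                yield key
        yield from self._dict

    def __len__(self):
        return sum(1 for _ in self)


class C:
    __slots__ = ("_a",)


class D(C):
    pass


d = D()
view = SlotView(d)
view["_a"] = 1       # stored in the slot
view["b"] = 2        # stored in d.__dict__
print(d._a, d.b)     # 1 2
print(sorted(view))  # ['_a', 'b']
```

Turning AttributeError into KeyError is one possible answer to the question left open in the comments above; it is a choice made for this sketch, not something the thread settled.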
[Guido]
From a recent SF mailing to project administrators:
DATA EXPORT ---------------------------
Jeremy (and less so I) played with that in the past (before it was publicized), but hit a brick wall: there seemed to be a cap on how many records it would deliver, and we couldn't brute-force our way around it. Maybe it's better now.
... SOMEBODY with admin perms should set up a cron job to suck down the nightly XML. It's big! (Are we still sucking down the nightly cvs tarballs? We should!)
IIRC, Barry was doing that on a home machine, and if so he's not around this week to answer.
Guido van Rossum writes:
SOMEBODY with admin perms should set up a cron job to suck down the nightly XML. It's big! (Are we still sucking down the nightly cvs tarballs? We should!)
It's failing for me now; I'll submit a support request. I think the tarballs are being downloaded to the python.org machine; I'm not sure if they're still landing on Barry's home machine. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
I wrote:
It's failing for me now; I'll submit a support request.
http://sourceforge.net/tracker/index.php?func=detail&aid=521302&group_id=1&atid=200001 -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
[Guido]
SOMEBODY with admin perms should set up a cron job to suck down the nightly XML.
[Fred]
It's failing for me now; I'll submit a support request.
It doesn't crap out for me, but this is the entire file I get back:

"""
<project_export>
<artifacts>
"""

Yes, I was logged in as an admin at the time. Else I get this:

"""
<project_export>
You are not an admin of this project. Permission denied.
"""

BTW, from the verbal description of what's supposed to happen, it sounds like it may not include attachments (like patches).
On Fri, 22 Feb 2002, Samuele Pedroni wrote:
I was thinking along the line of the C equiv of this: [...code snipped...]
[ An updated version of a comment to SF issue: http://sourceforge.net/tracker/?func=detail&atid=105470&aid=520644&group_id=5470 ]

Samuele's sltattr.py is an interesting approach, though I am not entirely sure it is sufficient to address all of the problems with slots. Here is a mostly complete list of smaller changes that are somewhat orthogonal to how we address accesses to __dict__:

1) Flatten slot lists: Change obj.__class__.__slots__ to return an immutable list of all slot descriptors in the object (including all those of base classes). The motivation for this is similar in spirit to storing a flattened __mro__.

   The advantages of this change are:

   a) allows for fast and explicit object reflection that correctly finds all dict attributes and all slot attributes.
   b) allows reflection implementations (like vars(object) and pickle) to treat dict and slot attrs differently if we choose not to proxy __dict__. This has several advantages, as explained in change #2. Also importantly, this way it is not possible to "lose" descriptors permanently by deleting them from obj.__class__.__dict__.

2) Update reflection API even if we do not choose to proxy __dict__: Alter vars(object) to return a dictionary of all attributes, including both the contents of the non-proxied __dict__ and the valid attributes that result from iterating over __slots__ and evaluating the descriptors. The details of how this is best implemented depend on how we wish to define the behavior of modifying the resulting dictionary. It could be either:

   a) explicitly immutable, which involves creating proxy objects
   b) mutable, which involves copying
   c) undefined, which means implicitly immutable

   Aside from the questions over the nature of the return type, this implementation (coupled with #1) has distinct advantages. Specifically the native object.__dict__ has a very natural internal representation that pairs attribute names directly with values. In contrast, a fair amount of additional work is needed to extract the slots that store values and create a dictionary of their names and values. Other implementations will require a great deal more work since they would have to traverse through base classes to collect slot descriptors.

3) Flatten slot inheritance: Update the new-style object inheritance mechanism to re-use slots of the same name, rather than creating a new slot and hiding the old. This makes the inheritance semantics of slots equivalent to those of normal instance attributes and avoids introducing an ad-hoc and obscure method of data hiding.

4) Update standard library to use new reflection API (and make them robust to properties at the same time) if we choose not to proxy __dict__. Virtually all of the changes are simple and involve updating these constructs:

   a) obj.__dict__
   b) obj.__dict__[blah]
   c) obj.__dict__[blah] = x

   (What these will become depends on other factors, including the context and semantics of vars(obj).)

   Here is a fairly complete list of Python 2.2 modules that will need to be updated: copy, copy_reg, inspect, pickle, pydoc, cPickle, Bastion, codeop, dis, doctest, gettext, ihooks, imputil, knee, pdb, profile, rexec, rlcompleter, tempfile, unittest, xmllib, xmlrpclib

5) (NB: potentially controversial and not required) We could alter the descriptor protocol to make slots (and properties) more transparent when the values they reference do not exist. Here is an example to illustrate this:

       class A(object):
           foo = 1

       class B(A):
           __slots__ = ('foo',)

       b = B()
       print b.foo
       > 1 or AttributeError?

   Currently an AttributeError is raised. However, it is a fairly easy change to make AttributeErrors signal that attribute resolution is to continue until either a valid descriptor is evaluated, an instance-attribute is found, or until the resolution fails after searching the meta-type, the type, and the instance dictionary.

I am prepared to submit patches to address each of these issues. However, I do want feedback beforehand, so that I do not waste time implementing something that will never be accepted.

Regards,
-Kevin
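[Editorial sketch, not part of the original message.] Changes #1 to #3 can be approximated in user code on a current Python 3 interpreter, which may help when weighing them. The helpers flat_slot_names and all_attrs are hypothetical names; all_attrs follows option (b) of change #2 by returning a copy. Private double-underscore slot names, which Python mangles, are silently skipped by this sketch.

```python
def flat_slot_names(cls):
    """Names of all slots declared by cls and its bases (change #1)."""
    names = []
    for klass in cls.__mro__:
        declared = klass.__dict__.get("__slots__", ())
        if isinstance(declared, str):  # a bare string declares a single slot
            declared = (declared,)
        for name in declared:
            if name not in ("__dict__", "__weakref__") and name not in names:
                names.append(name)
    return tuple(names)


def all_attrs(obj):
    """Copy of every instance attribute: __dict__ entries plus assigned slots."""
    attrs = dict(getattr(obj, "__dict__", {}))
    for name in flat_slot_names(type(obj)):
        try:
            attrs[name] = getattr(obj, name)
        except AttributeError:
            pass  # declared but never assigned
    return attrs


class Base:
    __slots__ = ("x",)


class Child(Base):
    __slots__ = ("x", "y")  # redeclares x: a second, separate slot


class Loose(Child):
    pass  # no __slots__, so instances also get a __dict__


obj = Loose()
obj.x, obj.z = 1, 3
print(flat_slot_names(Loose))  # ('x', 'y')
print(all_attrs(obj))          # {'z': 3, 'x': 1}

# The situation change #3 targets: Base's own x slot is hidden, not shared,
# so it is still unset even though obj.x has a value.
try:
    Base.__dict__["x"].__get__(obj, Loose)
    base_slot_set = True
except AttributeError:
    base_slot_set = False
print(base_slot_set)  # False
```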
participants (7)
- David Abrahams
- Fred L. Drake, Jr.
- Guido van Rossum
- Kevin Jacobs
- Moore, Paul
- Samuele Pedroni
- Tim Peters