Re: Of slots and metaclasses...

[Kevin Jacobs wrote me in private to ask my position on __slots__. I'm posting my reply here, quoting his full message -- I see no reason to carry this on as a private conversation. Sorry, Kevin, if this wasn't your intention.]
Hi Kevin, you got me to finally browse the thread "Meta-reflections". My first response was: "you've got it all wrong." My second response was a bit more nuanced: "that's not how I intended it to be at all!" OK, let me elaborate. :-)

You want to be able to find out which instance attributes are defined by __slots__, so that (by combining this with the instance's __dict__) you can obtain the full set of attribute values. But this defeats the purpose of unifying built-in types and user-defined classes. A new-style class, with or without __slots__, should be considered no different from a new-style built-in type, except that all of the methods happen to be defined in Python (except maybe for inherited methods).

In order to find all attributes, you should *never* look at __slots__. You should search the __dict__ of the class and its base classes, in MRO order, looking for descriptors, and *then* add the keys of the instance's __dict__ as a special case. This is how PEP 252 wants it to be. If the descriptors don't tell you everything you need, too bad -- some types just are like that. For example, if you're deriving from a list or tuple, there's no attribute that leads to the items: you have to use __len__ and __getitem__ to find out about these, and you have to "know" that that's how you get at them (although the presence of __getitem__ should be a clue).

Why do I reject your suggestion of making __slots__ (more) usable for introspection? Because it would create another split between built-in types and user-defined classes: built-in types don't have __slots__, so any strategy based on __slots__ will only work for user-defined types. And that's exactly what I'm trying to avoid!

You may complain that there are so many things to be found in a class's __dict__, it's hard to tell which things are descriptors. Actually, it's easy: if it has a __get__ (method) attribute, it's a descriptor; if it also has a __set__ attribute, it's a data attribute, otherwise it's a method.
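[Editor's note: the search described above can be sketched in a few lines of Python. `find_attrs` is a hypothetical helper name, written in modern style for concreteness; it is not part of any stated API.]

```python
def find_attrs(obj):
    """Collect attribute names the PEP 252 way: scan the class and its
    bases in MRO order for descriptors, then add the keys of the
    instance __dict__ -- without ever looking at __slots__."""
    names = set()
    for klass in type(obj).__mro__:
        for name, value in vars(klass).items():
            if hasattr(value, '__get__'):
                # It's a descriptor; with __set__ it's a data attribute,
                # without __set__ it's (usually) a method.
                names.add(name)
    # Instances without a __dict__ (e.g. classes using __slots__)
    # simply contribute nothing here.
    names.update(getattr(obj, '__dict__', {}))
    return names

class Point(object):
    __slots__ = ('x', 'y')

p = Point()
p.x = 1
assert 'x' in find_attrs(p) and 'y' in find_attrs(p)
```

Note that this finds the slot names only because slots are implemented as data descriptors in the class __dict__ -- the __slots__ variable itself is never consulted.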
(Note that read-only data attributes have a descriptor whose __set__ method always raises TypeError or AttributeError.)

Given this viewpoint, you won't be surprised that I have little desire to implement your other proposals; in particular, I reject all these:

- Proxy the instance __dict__ with something that makes the slots visible
- Flatten slot lists and make them immutable
- Alter vars(obj) to return a dict of all attrs
- Flatten slot inheritance (see below)
- Change descriptors to fall back on class variables for unfilled slots

I'll be the first to admit that some details are broken in 2.2. In particular, the fact that instances of classes with __slots__ appear picklable but lose all their slot values is a bug -- these should either not be picklable unless you add a __reduce__ method, or they should be pickled properly. This is a bug of the same kind as the problem with pickling time.localtime() (SF bug #496873), so I'm glad this problem has now been entered in the SF database (as #520644). I haven't made up my mind on how to fix this -- it would be nice if __slots__ would automatically be pickled, but it's tricky (although I think it's doable -- without ever referencing the __slots__ variable :-).

I'm not so sure that the fact that you can "override" or "hide" slots defined in a base class should be classified as a bug. I see it more as a "don't do that" issue: if you're deriving a class that overrides a base class slot, you haven't done your homework. PyChecker could warn about this though.

I think you're mostly right with your proposal "Update standard library to use new reflection API". Insofar as there are standard support classes that use introspection to provide generic services for classic classes, it would be nice if these could work correctly for new-style classes even if they use slots or are derived from non-trivial built-in types like dict or list. This is a big job, and I'd love some help.
Adding the right things to the inspect module (without breaking pydoc :-) would probably be a first priority. Now let me get to the rest of your letter.
Wow. That's more than I've ever managed (due to what I hope can still be called a mild case of ADD :-). But I think I studied all the important parts. (I should ask the authors for a percentage -- I think they've made quite some sales because of my frequent quoting of their book. :-)
Maybe you can formulate it as a set of tentative clarifying patches to PEPs 252, 253, and 254?
Not much more than what I've done so far. A lot of what they describe is awfully C++ specific anyway; a lot of the things they struggle with (such as the redispatch hacks and requestFirstCooperativeMethodCall) can be done so much simpler in a dynamic language like Python that I doubt we should follow their examples literally.
2) In Python 2.2, what intentional deviations have you chosen from the SOMMCP and what differences are incidental or accidental?
Hard to say, unless you specifically list all the things that you consider part of the SOMMCP. Here are some things I know:

- In descrintro.html, I describe a slightly different algorithm for calculating the MRO than they use. But my implementation is theirs -- I didn't realize the two were different until it was too late, and it only matters in uninteresting corner cases.

- I currently don't complain when there are serious order disagreements. I haven't decided yet whether to make these an error (then I'd have to implement an overridable way of defining "serious") or whether it's more Pythonic to leave this up to the user.

- I don't enforce any of their rules about cooperative methods. This is Pythonic: you can be cooperative but you don't have to be. It would also be too incompatible with current practice (I expect few people will adopt super().)

- I don't automatically derive a new metaclass if multiple base classes have different metaclasses. Instead, I see if any of the metaclasses of the bases is usable (i.e. I don't need to derive one anyway), and then use that; instead of deriving a new metaclass, I raise an exception. To fix this, the user can derive a metaclass and provide it in the __metaclass__ variable in the class statement. I'm not sure whether I should automatically derive metaclasses; I haven't got enough experience with this stuff to get a good feel for when it's needed. Since I expect that non-trivial metaclasses are often implemented in C, I'm not so comfortable with automatically merging multiple metaclasses -- I can't prove to myself that it's always safe.

- I don't check that a base class doesn't override instance variables. As I stated above, I don't think I should, but I'm not 100% sure.
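[Editor's note: the metaclass behavior in the fourth item can be observed directly. The sketch below calls the type constructors explicitly so no particular class-statement syntax is assumed; MetaA, MetaB, etc. are placeholder names.]

```python
class MetaA(type):
    pass

class MetaB(type):
    pass

A = MetaA('A', (object,), {})
B = MetaB('B', (object,), {})

# With two unrelated metaclasses among the bases, none is usable
# and class creation fails instead of a merged metaclass being derived:
try:
    C = type('C', (A, B), {})
    raised = False
except TypeError:   # "metaclass conflict"
    raised = True
assert raised

# The fix is to derive the combined metaclass by hand and supply it:
MetaAB = type('MetaAB', (MetaA, MetaB), {})
C = MetaAB('C', (A, B), {})
assert type(C) is MetaAB
```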
3) Do you intend to enforce monotonicity for all methods and slots? (Clearly, this is not desirable for instance __dict__ attributes.)
If I understand the concept of monotonicity, no. Python traditionally allows you to override methods in ways that are incompatible with the contract of the base class method, and I don't intend to forbid this. It would be good if PyChecker checked for accidental mistakes in this area, and maybe there should be a way to declare that you do want this enforced; I don't know how though. There's also the issue that (again, if I remember the concepts right) there are some semantic requirements that would be really hard to check at compile time for Python.
4) Should descriptors work cooperatively? i.e., allowing a 'super' call within __get__ and __set__.
I don't think so, but I haven't thought through all the consequences (I'm not sure why you're asking this, and whether it's still a relevant question after my responses above). You can do this for properties though. Thanks for the dialogue! --Guido van Rossum (home page: http://www.python.org/~guido/)

On Thu, 28 Feb 2002, Guido van Rossum wrote:
No problem -- I sent it privately only to spare python-dev if you happened to be too busy for a coherent reply.
Yes -- I can see why my initial effort to make slots work "just like __dict__ attributes" was a bad idea. However, it took reading 'Putting Metaclasses to Work' for me to realize that.
I suppose the purpose of unifying built-in types and user-defined classes is rather subjective. There are many roads that will get us there, and I happened to fixate on another one...
Sure. Except that I also want to be able to extend existing new-style classes/types in C, as well as Python. Here is how I do it now (minus error checking and ref-counting):

    static PyMethodDef PyRow_methods[] = {
        {"__init__",    (PyCFunction)rowinit,     METH_VARARGS},
        {"__repr__",    (PyCFunction)rowstrrepr,  METH_NOARGS},
        {"__getitem__", (PyCFunction)rowgetitem,  METH_VARARGS},
        /* etc... */
        {NULL, NULL}  /* sentinel, so the loop below knows where to stop */
    };

    PyRow_Type = (PyTypeObject *)PyType_Type.tp_call((PyObject *)&PyType_Type,
                                                     args, NULL);

    /* Methods must be added _after_ PyRow_Type has been created,
       since the type is an argument to PyDescr_NewMethod */
    dict = PyRow_Type->tp_dict;
    for (meth = PyRow_methods; meth->ml_name != NULL; meth++) {
        PyObject *method = PyDescr_NewMethod(PyRow_Type, meth);
        PyDict_SetItemString(dict, meth->ml_name, method);
    }

Though this doesn't look nearly as ugly as it did when I first wrote it, before I read 'Putting Metaclasses to Work'; strangely enough it ends up looking a lot like their metaclass interface.
Sure. I was just hoping to have that list of descriptors pre-computed and stored in the class (like __mro__). I suppose the question is why even expose __slots__ if it is so worthless?
If the descriptors don't tell you everything you need, too bad -- some types just are like that.
This has _never_ been a concern of mine -- I don't mind if the C implementation chooses to hide things.
Well, I'm busy creating C extension types that *do* have slots! One of my many current projects is to create a better type to store the results of relational database queries. I want the memory efficiency of tuples and the ability to query by name (via __getitem__ or __getattr__). So I basically need to re-invent a magic tuple type that adds descriptors for every named field. Strangely enough, this is basically what the slots mechanism does. I do realize that I could accomplish the same end by sub-classing tuple and adding a bunch of descriptors.
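[Editor's note: the tuple-subclass approach mentioned in the last sentence can be sketched in pure Python. `field` and `Row` are hypothetical names; this is an illustration, not the poster's actual C implementation.]

```python
class field(object):
    """Hypothetical descriptor mapping an attribute name to a tuple index."""
    def __init__(self, index):
        self.index = index
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # accessed on the class itself
        return obj[self.index]   # delegate to tuple indexing

class Row(tuple):
    __slots__ = ()    # no per-instance __dict__; instances stay tuple-sized
    name = field(0)
    score = field(1)

row = Row(('widget', 42))
assert row.name == 'widget'   # query by name...
assert row[1] == 42           # ...or by position, as with a plain tuple
```

The empty __slots__ on the subclass is what keeps the memory efficiency of tuples: without it, every Row instance would also carry a __dict__.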
I wasn't really thrilled with this idea myself. Among all the other reasons not to do this, it has some terrible performance implications.
- Flatten slot lists and make them immutable
Again, why even have __slots__ if they are so useless? Assuming that there is a legitimate reason to peek at __slots__, why not at least make them immutable? Or, even better, why not use __slots__ to expose the etype slot tuple instead?
- Alter vars(obj) to return a dict of all attrs
Ok, I'm a little baffled by this. Why not?
My vote is that they should be pickled properly by default. In my mind, slots are a more static type of attribute. Since they are more static, my feeling is that they should be as or more accessible than dict attributes. Descriptors are fine for handling the black magic of making them addressable by name, but it just feels wrong to hide them from access by other means. Of course, I am really talking about slots defined at the Python level -- not necessarily all storage allocated in the 'members' array.
Unless attribute access becomes scoped based on the static type of the method, then I think it is a bug. Re-declared slots become effectively orphaned and just waste memory. Coalescing them or raising an exception when they are re-declared seem much better alternatives.
Well, I'm happy to contribute, though my primary concern (other than correctness and completeness) is efficiency. The whole reason I'm using slots is to save space when allocating huge numbers of fairly small objects. I believe that there is a big performance difference between being able to pickle based on arbitrary descriptors and pickling just slots. Slots are already nicely laid out in rows, just waiting to be plucked out and stuffed into a pickle. Even without flattened __slots__ lists, it is a fast and trivial operation to iterate over a class and all its bases and extract slots. Doing so over dictionaries is not nearly so trivial.
Maybe you can formulate it as a set of tentative clarifying patches to PEPs 252, 253, and 254?
To be honest, I forgot that those PEPs existed! I've been working off of the Python 2.2 source and the tutorials. I'll read them over tonight and see.
When I say SOMMCP, I really mean the "metaclass protocol" defined by the various postulates and theorems in the first few chapters of the book.
Sure -- I noticed this. Maybe you should store the order-safety in the metaclass? That way, the user can inspect it when they decide it is important.
I agree with most of that, except that I expect that MANY people will start using 'super'. I've trained an office full of Java programmers to program in Python and they are always complaining about the lack of super calls. Also, I've _always_ considered this idiom ugly and hackish:

    class Foo(Bar, Baz):
        def __init__(self):
            Bar.__init__(self)
            Baz.__init__(self)

It's so much better as:

    class Foo(Bar, Baz):
        def __init__(self):
            # when super becomes a keyword and we write nice
            # cooperative __init__ methods
            super.__init__(self)
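[Editor's note: with the spelling that 2.2 actually provides, the cooperative version of the above looks like this. Bar and Baz are placeholder classes; the point is that a single super chain visits each __init__ exactly once, in MRO order.]

```python
class Bar(object):
    def __init__(self):
        self.bar_ready = True
        super(Bar, self).__init__()   # pass the call along the MRO

class Baz(object):
    def __init__(self):
        self.baz_ready = True
        super(Baz, self).__init__()

class Foo(Bar, Baz):
    def __init__(self):
        # One call; the MRO (Foo, Bar, Baz, object) routes it onward.
        super(Foo, self).__init__()

foo = Foo()
assert foo.bar_ready and foo.baz_ready
```

Unlike the explicit Bar.__init__/Baz.__init__ idiom, this does not call a shared base class's __init__ twice under diamond inheritance.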
- I don't automatically derive a new metaclass if multiple base classes have different metaclasses.
I have my own ideas about this, but like you, don't have enough experience with them in practice to do anything about it.
It is always safe when the assumption of monotonicity is not violated.
Do you mean slots or all Python instance attributes in this statement?
For Python, monotonicity means that the instance attributes and instance methods of a class are a superset of those of all its ancestors. This is not the way that normal __dict__ attributes work in Python, so let's talk only about slots when discussing monotonic properties. In other words, it means that the metaclass interface does not provide a way to delete a slot or a method, only ways to add and override them. Combined with some static type information, the assumption of monotonicity will be very helpful when we can eventually compile Python.
I have a pretty good idea how. It's essentially a proof-based method that works by solving metatype constraints.
True for __dict__ instance attributes, not for slots!
    class Foo(object):
        __slots__ = ()
        a = 1

    class Bar(Foo):
        __slots__ = ('a',)

    bar = Bar()
    print dir(a)
    print a

The resolution rule for descriptors could work cooperatively to find Foo's class attribute 'a' instead of giving up with an AttributeError.

Thanks for the very useful answers,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19   E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714        WWW:    http://www.theopalgroup.com

[me]
[Kevin]
Heh?!?!!! Why can't you declare PyRow_Type as a statically initialized struct like all extensions and the core do? [snip]
Sure. I was just hoping to have that list of descriptors pre-computed and stored in the class (like __mro__).
__mro__ gets used *all the time*; on every method lookup at least. The list of instance variable descriptors is only interesting to a small number of highly introspective tools.
I suppose the question is why even expose __slots__ if it is so worthless?
It's found in the dict when the class is defined. Why delete it? The idea is that you can make it a dict that has other info about the slots. It's got a __foo__ name. I can give it any semantics I damn well please. :-)
Exactly, and I'm telling you to have the same attitude about slots. Let me repeat something I just sent someone else about slots:

It seems that unfortunately __slots__ is Python 2.2's most misunderstood feature... I see it as a hack that lets me define a special-purpose class whose instances are (almost) as efficient as I can do using C, but without having to write a C extension. (I say "almost", because a C extension can store simple values as C ints, while __slots__ only lets you store PyObject pointers. But still, it's a big savings compared to adding a __dict__ to every instance, and sometimes the slot value is picked from a small number of interned or cached ints or strings.)

It has different semantics from regular attributes, and I don't try to hide that: introspection doesn't find slots the same way as it finds regular instance vars, you can't provide a default via a class variable, and there are a bunch of "don't do that" things like modifying __slots__ of an existing class or overriding a slot defined by a base class. (There's a whole list of warnings in http://www.python.org/2.2/descrintro.html!) I think as such, the feature is just right (except for the no-pickling bug).

It's unfortunate that people have jumped on it as the answer to all their questions. I guess that means there's a big demand for more control over instance variables -- whether that demand is created by a real need or simply because that's how most other languages do it remains to be seen...
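[Editor's note: the semantic differences listed above are easy to observe. A minimal sketch:]

```python
class Slotted(object):
    __slots__ = ('a', 'b')

s = Slotted()
s.a = 1

# No per-instance __dict__ is allocated -- that's where the savings come from:
assert not hasattr(s, '__dict__')

# Names outside __slots__ are rejected instead of silently stored:
try:
    s.c = 3
    rejected = False
except AttributeError:
    rejected = True
assert rejected

# An unassigned slot raises AttributeError rather than picking up a default:
try:
    s.b
    unset = False
except AttributeError:
    unset = True
assert unset
```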
Note that there's something already there that you might reuse: Objects/structseq.c, which is used to create the return values of localtime(), stat() and a few others in a way that looks both like a tuple and like a read-only record. It may not be powerful enough because I think the assumption is that the set of field names is static, but you may be able to extend it or copy some good ideas. (Just don't try to understand what it does to make the tuple shorter than the record in some cases -- that's for backwards compatibility because lots of code would break if e.g. stat() returned a longer tuple than in previous Python versions, but we still want to provide new fields when using named fields. This part is not for the weak of heart, and I didn't write it, and can't guarantee that it's 100% bugfree.) [items I rejected]
- Alter vars(obj) to return a dict of all attrs
Ok, I'm a little baffled by this. Why not?
Currently, the assumption is that vars() returns a dict that can be modified to modify the underlying object's attributes. If it were to return a synthetic dict, that wouldn't work, or it would require more implementation effort than I care for -- again, since I doubt there is much demand for this outside a small set of introspection tools.
Slots share their descriptor implementation with anything defined by the tp_members array in a type object. E.g. file.softspace is a descriptor of the same type as used by slots. What they share is that they refer to "real" data stored in the instance -- either a PyObject* or some basic C type like int or double. I don't want to trust that __slots__ has the right data: even if I made it immutable, someone could still do C.__dict__['__slots__'] = <whatever>, and I don't want to go so far as to make __slots__ a property stored in the type object. So I can't really tell which descriptors are slots and which are other things -- and I don't want to, because I believe that would be breaking through an abstraction.
It's a bug to redeclare a slot. I don't find it Python's job to make it an error.
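[Editor's note: the shadowing being discussed can be reproduced directly; the example below mirrors the Foo/Bar classes from earlier in the thread, with the NameError corrected.]

```python
class Foo(object):
    __slots__ = ()
    a = 1

class Bar(Foo):
    __slots__ = ('a',)   # shadows Foo's class attribute -- "don't do that"

bar = Bar()

# The slot descriptor in Bar is found first along the MRO and, being a
# data descriptor for an unfilled slot, it raises AttributeError rather
# than falling back to Foo.a:
try:
    bar.a
    shadowed = False
except AttributeError:
    shadowed = True
assert shadowed

# Once assigned, the slot behaves normally:
bar.a = 2
assert bar.a == 2
```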
I think you're overstating the simplicity of pickling slots. There is no guarantee that the slots of a derived class are contiguous with the slots of a base class; a __weakref__ and a __dict__ field may be placed in between, and another metaclass could add other things. For example, you could write a metaclass in C that took the __slots__ idea one step further and let you declare the types of the slots as basic C types, so that other structmember keys could be used, e.g. T_INT or T_FLOAT. If you want your instances to be pickled *efficiently*, you should write a custom reduce method in C anyway -- right now, new-style classes are pickled by a piece of Python code at the end of copy_reg.py.
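[Editor's note: for classes that want correct pickling today, a pure-Python __reduce__ along the lines Guido hints at -- gathering values through the data descriptors found along the MRO, never reading the __slots__ variable -- might look like this. `_rebuild` is a hypothetical helper, and this sketch makes no claim to the efficiency a C reduce method would have.]

```python
import pickle

def _rebuild(cls, state):
    """Hypothetical helper: recreate an instance and restore its state."""
    obj = cls.__new__(cls)
    for name, value in state.items():
        setattr(obj, name, value)
    return obj

class Point(object):
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __reduce__(self):
        # Collect values via the data descriptors found along the MRO;
        # dunder names are skipped, and the __slots__ variable itself
        # is never consulted.
        state = {}
        for klass in type(self).__mro__:
            for name, value in vars(klass).items():
                if name.startswith('__'):
                    continue
                if hasattr(value, '__get__') and hasattr(value, '__set__'):
                    try:
                        state[name] = getattr(self, name)
                    except AttributeError:
                        pass    # unfilled slot: leave it out of the state
        return (_rebuild, (type(self), state))

p = pickle.loads(pickle.dumps(Point(3, 4)))
assert p.x == 3 and p.y == 4
```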
I had a feeling you were missing something basic. :-)
When I say SOMMCP, I really mean the "metaclass protocol" defined by the various postulates and theorems in the first few chapters of the book.
As I said, I don't have the whole set in my head, so you'll have to be more specific in your questions. (Basically, I don't expect to be adding much from the book, but I'll be looking to the book for clues as we find problems with how things are implemented now, e.g. the automatically derived metaclass issue below.)
You mean in the class object? I'm not sure what you mean by "storing the order-safety". I currently don't calculate whether there are any order conflicts: serious_order_disagreements() returns 0 without doing anything. Someone who wants it can easily implement the check from the book though.
I doubt it with the current super(Class,self).method(args) notation. Probably they will once super is a keyword so you can write super.method(args).
Strange that you mention Java in the same paragraph as an example using multiple inheritance. ;-/ Also note that this is pretty much what C++ wants you to do, except it uses '::' instead of '.' and doesn't require you to pass self (which is a different issue). I don't see this as a serious issue, just syntactic sugar.
But that's not what you'd be writing -- you'd be writing super.__init__().
Can you share them? This might be interesting.
And that we can't know.
I just meant slots, but in a sense it's also true for other ivars: if you don't know that your base class defines an ivar 'foo', you might create your own ivar named 'foo' and use it in a way that's inconsistent with the base class. Because there are no type checks and no ivar declarations, that's much harder to avoid in Python than in more static languages like C++ or Java (I assume those will complain when you redefine an ivar, even with the same type).
I'm not sure what you mean by "this is not the way that normal __dict__ attrs work", unless you are talking about overriding __init__ without calling the base class __init__ (and perhaps the same for other methods), which of course can mean that a derived class instance lacks an ivar that a base class instance would have. This is Pythonic freedom IMO. Since it's not true for regular ivars, why worry about it for slots?
I don't think we should be guided here by what might be needed by a compiler. Without actually trying to build a compiler, we'll probably miss important requirements that mean we'll have to change the language anyway, and we'll impose requirements that we think might be important without a good reason. (E.g. structured programming was once thought as an aid to compiler technology as well as to the human reader. Nowadays, optimizers reduce all control flow to labels and goto statements. :-)
Isn't that how most of PyChecker works? At least the proof-based part?
Again, you're trying to hijack slots for purposes for which they weren't created. Think of slots as an efficiency hack, *not* as a better way to declare ivars.
That's a NameError, I suppose you meant 'bar' instead of 'a' in the last two lines, then it makes sense. :-)
Once a descriptor is found, that's the end of the line. When you find a method, you call it, and if it raises an exception, you're not going to continue looking for a base class method either! The descriptor type used to implement slots could do this, but doesn't. I don't care about this feature. With a __dict__, there's some real saving in not storing default values, since it means a smaller dict, which can save space. The slot space is always there, so you might as well initialize it.

Concluding: don't expect that you can take an arbitrary class, analyze what ivars it uses, and add a __slots__ variable to speed it up. There are lots of differences in semantics when you use slots, and I don't want to hide those.

--Guido van Rossum (home page: http://www.python.org/~guido/)

On Thu, 28 Feb 2002, Guido van Rossum wrote:
No problem -- I sent it privately only to spare python-dev if you happened to be too busy for a coherent reply.
Yes -- I can see why my initial efforts of making slots work "just like __dict__ attributes" is a bad idea. However, it took reading 'Putting Metaclasses to Work' for me to realize that.
I suppose the purpose of unifying built-in types and user-defined classes is rather subjective. There are many roads that will get us there, and I happened to fixate on another one...
Sure. Except that I also want to be able to extend existing new-style classes/types in C, as well as Python. Here is how I do it now (minus error checking and ref-counting): static PyMethodDef PyRow_methods[] = { {"__init__", (PyCFunction)rowinit, METH_VARARGS}, {"__repr__", (PyCFunction)rowstrrepr, METH_NOARGS }, {"__getitem__", (PyCFunction)rowgetitem, METH_VARARGS} /* etc... */ } PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type,args, NULL) /* Methods must be added _after_ PyRow_Type has been created since the type is an argument to PyDescr_NewMethod */ dict = PyRow_Type->tp_dict; meth = PyRow_methods; for (; meth->ml_name != NULL; meth++) { PyObject* method = PyDescr_NewMethod(PyRow_Type, meth); PyDict_SetItemString(dict,meth->ml_name,method); } Though this doesn't look nearly as ugly as it did when I first wrote it, before I read 'Putting Metaclasses to Work'; strangely enough it ends up looking a lot like their metaclass interface.
Sure. I was just hoping to have that list of descriptors pre-computed and stored in the class (like __mro__). I suppose the question is why even expose __slots__ if it is so worthless?
If the descriptors don't tell you everything you need, too bad -- some types just are like that.
This has _never_ been a concern of mine -- I don't mind if the C implementation chooses to hide things.
Well, I'm busing creating C extension types that *do* have slots! One of my many current projects is to create a better type to store the results of relational database queries. I want the memory efficiency of tuples and the ability to query by name (via __getitem__ or __getattr__). So I basically need to re-invent a magic tuple type that adds descriptors for every named field. Strangely enough, this is basically what the slots mechanism does. I do realize that I could accomplish the same end by sub-classing tuple and adding a bunch of descriptors.
I wasn't real thrilled with this idea myself. Among all the other reasons why not to do this, it has some terrible performance implications.
- Flatten slot lists and make them immutable
Again, why even have __slots__ if they are so useless? Assuming that there is a legitimate reason to peek at __slots__, why not at least make them immutable? Or, even better, why not use __slots__ to expose the etype slot tuple instead?
- Alter vars(obj) to return a dict of all attrs
Ok, I'm a little baffled by this. Why not?
My vote is that they should be pickled properly by default. In my mind, slots are a more static type of attribute. Since they are more static, my feeling is that they should be as or more accessible than dict attributes. Descriptors are fine for handing the black magic of making them addressable by name, but it just feels wrong to hide them from access by other means. Of course, I am really talking about slots defined at the Python level -- not necessarily all storage allocated in the 'members' array.
Unless attribute access becomes scoped based on the static type of the method, then I think it is a bug. Re-declared slots become effectively orphaned and just waste memory. Coalescing them or raising an exception when they are re-declared seem much better alternatives.
Well, I'm happy to contribute, though my primary concern (other than correctness and completeness) is efficiency. The whole reason I'm using slots is to save space when allocating huge numbers of fairly small objects. I believe that there is a big performance difference between being able to pickle based on arbitrary descriptors and pickling just slots. Slots are already nicely laid out in rows, just waiting to be plucked out and stuffed into a pickle. Even without flattened __slots__ lists, it is a fast and trivial operation to iterate over a class and all its bases and extract slots. Doing so over dictionaries is not nearly so trivial.
Maybe you can formulate it as a set of tentative clarifying patches to PEPs 252, 253, and 254?
To be honest, I forgot that those PEPs existed! I've been working off of the Python 2.2 source and the tutorials. I'll read them over tonight and see.
When I say SOMMCP, I really mean the "metaclass protocol" defined by the various postulates and theorems in the first few chapters of the book.
Sure -- I noticed this. Maybe you should store the order-safety in the metaclass? That way, the user can inspect it when they decide it is important.
I agree with most of that, except that I expect that MANY people will start using 'super'. I've trained an office full of Java programmers to program in Python and they are always complaining about the lack of super calls. Also, I've _always_ considered this idiom ugly and hackish: def Foo(Bar,Baz): def __init__(self): Bar.__init__(self) Baz.__init__(self) Its so much better as: def Foo(Bar,Baz): def __init__(self): # when super becomes a keyword and we write nice cooperative __init__ # methods super.__init__(self)
- I don't automatically derive a new metaclass if multiple base classes have different metaclasses.
I have my own ideas about this, but like you, don't have enough experience with them in practice to do anything about it.
It is always safe when the assumption of monotonicity is not violated.
Do you mean slots or all Python instance attributes in this statement?
For Python, monotonicity means that the instance attributes and instance methods of a class are a superset of those of all its ancestors. This is not the way that normal __dict__ attributes work in Python, so lets talk only about slots when discussing monotonic properties. In order words, it means that the metaclass interface does not provide a way to delete a slot or a method, only ways to add and override them. Combined with some static type information, the assumption of monotonicity will be very helpful when we can eventually compile Python.
I have a pretty good idea how. Its essentially a proof-based method that works by solving metatype constraints.
True for __dict__ instance attributes, not for slots!
class Foo(object): __slots__=() a = 1 class Bar(Foo): __slots__ = ('a',) bar = Bar() print dir(a) print a The resolution rule for descriptors could work cooperatively to find Foo's class attribute 'a' instead of giving up with an AttributeError. Thanks for the very useful answers, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

[me]
[Kevin]
Heh?!?!!! Why can't you declare PyRow_Type as a statically initialized struct like all extensions and the core do? [snip]
Sure. I was just hoping to have that list of descriptors pre-computed and stored in the class (like __mro__).
__mro__ gets used *all the time*; on every method lookup at least. The list of instance variable descriptors is only interesting to a small number of highly introspective tools.
I suppose the question is why even expose __slots__ if it is so worthless?
It's found in the dict when the class is defined. Why delete it? The idea is that you can make it a dict that has other info about the slots. It's got a __foo__ name. I can give it any semantics I damn well please. :-)
Exactly, and I'm telling you to have the same attitude about slots. Let me repeat something I just sent someone else about slots: It seems that unfortunately __slots__ is Python 2.2's most misunderstood feature... I see it as a hack that lets me define a special-purpose class whose instances are (almost) as efficient as I can do using C, but without having to write a C extension. (I say "almost", because a C extension can store simple values as C ints, while __slots__ only lets you store PyObject pointers. But still, it's a big savings compared to adding a __dict__ to every instance, and sometimes the slot value is picked from a small number of interned or cached ints or strings.) It has different semantics from regular attributes, and I don't try to hide that: introspection doesn't find slots the same way as it finds regular instance vars, you can't provide a default via a class variable, and there are a bunch of "don't do that" things like modifying __slots__ of an existing class or overriding a slot defined by a base class. (There's a whole list of warnings in http://www.python.org/2.2/descrintro.html!) I think as such, the feature is just right (except for the no-pickling bug). It's unfortunate that people have jumped on it as the answer to all their questions. I guess that means there's a big demand for more control over instance variables -- whether that demand is created by a real need or simply because that's how most other languages do it remains to be seen...
Note that there's something already there that you might reuse: Objects/structseq.c, which is used to create the return values of localtime(), stat() and a few others in a way that looks both like a tuple and like a read-only record. It may not be powerful enough, because I think the assumption is that the set of field names is static, but you may be able to extend it or copy some good ideas. (Just don't try to understand what it does to make the tuple shorter than the record in some cases -- that's for backwards compatibility, because lots of code would break if e.g. stat() returned a longer tuple than in previous Python versions, but we still want to provide new fields when using named fields. This part is not for the weak of heart; I didn't write it, and can't guarantee that it's 100% bug-free.)

[items I rejected]
- Alter vars(obj) to return a dict of all attrs
Ok, I'm a little baffled by this. Why not?
Currently, the assumption is that vars() returns a dict that can be modified to modify the underlying object's attributes. If it were to return a synthetic dict, that wouldn't work, or it would require more implementation effort than I care for -- again, since I doubt there is much demand for this outside a small set of introspection tools.
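The live-dict assumption is easy to demonstrate in current CPython: `vars(obj)` returns the instance's actual `__dict__`, so mutating it mutates the object — and a fully slotted instance, having no `__dict__`, makes `vars()` fail rather than return a synthetic dict:

```python
class C:
    pass

c = C()
c.a = 1
d = vars(c)            # the instance's real __dict__, not a copy
d['b'] = 2             # mutating it mutates the object
print(c.b)             # 2

class S:
    __slots__ = ('x',)

try:
    vars(S())
except TypeError:
    print('no __dict__, so vars() fails')
```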
Slots share their descriptor implementation with anything defined by the tp_members array in a type object. E.g. file.softspace is a descriptor of the same type as used by slots. What they share is that they refer to "real" data stored in the instance -- either a PyObject* or some basic C type like int or double. I don't want to trust that __slots__ has the right data: even if I made it immutable, someone could still do C.__dict__['__slots__'] = <whatever>, and I don't want to go so far as to make __slots__ a property stored in the type object. So I can't really tell which descriptors are slots and which are other things -- and I don't want to, because I believe that would be breaking through an abstraction.
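The shared implementation is visible from Python: a slot's descriptor is the same member-descriptor machinery used for tp_members entries, and by the head's rule (`__get__` plus `__set__`) it is an ordinary data descriptor — nothing marks it as "a slot". The type name printed below is a CPython detail, not a guaranteed API:

```python
class C:
    __slots__ = ('x',)

slot_desc = C.__dict__['x']
print(type(slot_desc).__name__)      # 'member_descriptor' in CPython
# It answers to the generic descriptor protocol, so introspection
# cannot (and need not) distinguish it from other data descriptors:
print(hasattr(slot_desc, '__get__'), hasattr(slot_desc, '__set__'))
```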
It's a bug to redeclare a slot. I don't think it's Python's job to make it an error.
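What "bug, not error" means in practice: redeclaring a slot in a subclass is silently accepted, but it allocates a second storage cell whose descriptor shadows the base class's, wasting space. A small sketch:

```python
class Base:
    __slots__ = ('x',)

class Derived(Base):
    __slots__ = ('x',)   # accepted silently; allocates a second slot

# Two distinct descriptors, hence two distinct storage cells:
print(Base.__dict__['x'] is Derived.__dict__['x'])   # False

obj = Derived()
obj.x = 1            # goes through Derived's descriptor;
print(obj.x)         # Base's cell is now unreachable by name
```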
I think you're overstating the simplicity of pickling slots. There is no guarantee that the slots of a derived class are contiguous with the slots of a base class; a __weakref__ and a __dict__ field may be placed in between, and another metaclass could add other things. For example, you could write a metaclass in C that took the __slots__ idea one step further and let you declare the types of the slots as basic C types, so that other structmember keys could be used, e.g. T_INT or T_FLOAT. If you want your instances to be pickled *efficiently*, you should write a custom reduce method in C anyway -- right now, new-style classes are pickled by a piece of Python code at the end of copy_reg.py.
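One way to take explicit control, sketched here with the standard `__getstate__`/`__setstate__` hooks (modern pickle can often handle slotted classes by itself, so this is illustrative rather than required). Note that this only works because `P` reads its own `__slots__`; base-class slots or C-level members would not be covered — which is exactly the contiguity caveat above:

```python
import pickle

class P:
    __slots__ = ('x', 'y')

    def __getstate__(self):
        # Capture only slots that were actually assigned.
        return {s: getattr(self, s) for s in self.__slots__
                if hasattr(self, s)}

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)

p = P()
p.x = 1
q = pickle.loads(pickle.dumps(p))
print(q.x)                 # 1
print(hasattr(q, 'y'))     # False: the unset slot stays unset
```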
I had a feeling you were missing something basic. :-)
When I say SOMMCP, I really mean the "metaclass protocol" defined by the various postulates and theorems in the first few chapters of the book.
As I said, I don't have the whole set in my head, so you'll have to be more specific in your questions. (Basically, I don't expect to be adding much from the book, but I'll be looking to the book for clues as we find problems with how things are implemented now, e.g. the automatically derived metaclass issue below.)
You mean in the class object? I'm not sure what you mean by "storing the order-safety". I currently don't calculate whether there are any order conflicts: serious_order_disagreements() returns 0 without doing anything. Someone who wants it can easily implement the check from the book though.
I doubt it with the current super(Class,self).method(args) notation. Probably they will once super is a keyword so you can write super.method(args).
Strange that you mention Java in the same paragraph as an example using multiple inheritance. ;-/ Also note that this is pretty much what C++ wants you to do, except it uses '::' instead of '.' and doesn't require you to pass self (which is a different issue). I don't see this as a serious issue, just syntactic sugar.
But that's not what you'd be writing -- you'd be writing super.__init__().
Can you share them? This might be interesting.
And that we can't know.
I just meant slots, but in a sense it's also true for other ivars: if you don't know that your base class defines an ivar 'foo', you might create your own ivar named 'foo' and use it in a way that's inconsistent with the base class. Because there are no type checks and no ivar declarations, that's much harder to avoid in Python than in more static languages like C++ or Java (I assume those will complain when you redefine an ivar, even with the same type).
I'm not sure what you mean by "this is not the way that normal __dict__ attrs work", unless you are talking about overriding __init__ without calling the base class __init__ (and perhaps the same for other methods), which of course can mean that a derived class instance lacks an ivar that a base class instance would have. This is Pythonic freedom IMO. Since it's not true for regular ivars, why worry about it for slots?
I don't think we should be guided here by what might be needed by a compiler. Without actually trying to build a compiler, we'll probably miss important requirements that mean we'll have to change the language anyway, and we'll impose requirements that we think might be important without a good reason. (E.g. structured programming was once thought as an aid to compiler technology as well as to the human reader. Nowadays, optimizers reduce all control flow to labels and goto statements. :-)
Isn't that how most of PyChecker works? At least the proof-base part?
Again, you're trying to hijack slots for purposes for which they weren't created. Think of slots as an efficiency hack, *not* as a better way to declare ivars.
That's a NameError; I suppose you meant 'bar' instead of 'a' in the last two lines -- then it makes sense. :-)
Once a descriptor is found, that's the end of the line. If you find a method, call it, and it raises an exception, you're not going to continue looking for a base class method either! The descriptor type used to implement slots could do this, but doesn't. I don't care about this feature.

With a __dict__, there's some real saving in not storing default values, since it means a smaller dict, which can save space. The slot space is always there, so you might as well initialize it.

Concluding: don't expect that you can take an arbitrary class, analyze what ivars it uses, and add a __slots__ variable to speed it up. There are lots of differences in semantics when you use slots, and I don't want to hide those.

--Guido van Rossum (home page: http://www.python.org/~guido/)