Add __slots__ to dataclasses to use less than half as much RAM
It's pretty wasteful to use a dynamic storage dictionary to hold the data of a "struct-like data container". Users can currently add `__slots__` manually to their `@dataclass` class, but it means they can no longer use default values, and the manual typing gets very tedious.

I compared the RAM usage and benchmarked the popular attrs library vs dataclass, and saw the following result: Slots win heavily in the memory usage department, regardless of whether you use dataclass or attrs. And dataclass with manually written slots uses 8 bytes less than attrs-with-slots (static number, does not change based on how many fields the class has). But dataclass loses with its lack of features, lack of default values if slots are used, and tedious way to write slots manually (see class "D").

Here are the numbers in bytes per-instance for classes:

```
attrs size 512
attrs-with-slots size 200
dataclass size 512
dataclass-with-slots size 192
```

As for data access benchmarks: The result varied too much between runs to draw any conclusions except to say that slots was slightly faster than dictionary-based storage. And that there's no real difference between the dataclass and attrs libraries in access-speed.
Here is the full benchmark code:

```
import attr
from dataclasses import dataclass
from pympler import asizeof
import time

# every additional field adds 88 bytes
@attr.s
class A:
    a = attr.ib(type=int, default=0)
    b = attr.ib(type=int, default=4)
    c = attr.ib(type=int, default=2)
    d = attr.ib(type=int, default=8)

# every additional field adds 40 bytes
@attr.s(slots=True)
class B:
    a = attr.ib(type=int, default=0)
    b = attr.ib(type=int, default=4)
    c = attr.ib(type=int, default=2)
    d = attr.ib(type=int, default=8)

# every additional field adds 88 bytes
@dataclass
class C:
    a: int = 0
    b: int = 4
    c: int = 2
    d: int = 8

# every additional field adds 40 bytes
@dataclass
class D:
    __slots__ = {"a", "b", "c", "d"}
    a: int
    b: int
    c: int
    d: int

Ainst = A()
Binst = B()
Cinst = C()
Dinst = D(0,4,2,8)

print("attrs size", asizeof.asizeof(Ainst)) # 512 bytes
print("attrs-with-slots size", asizeof.asizeof(Binst)) # 200 bytes
print("dataclass size", asizeof.asizeof(Cinst)) # 512 bytes
print("dataclass-with-slots size", asizeof.asizeof(Dinst)) # 192 bytes

s = time.perf_counter()
for i in range(0,250000000):
    x = Ainst.a
elapsed = time.perf_counter() - s
print("elapsed attrs:", (elapsed*1000), "milliseconds")

s = time.perf_counter()
for i in range(0,250000000):
    x = Binst.a
elapsed = time.perf_counter() - s
print("elapsed attrs-with-slots:", (elapsed*1000), "milliseconds")

s = time.perf_counter()
for i in range(0,250000000):
    x = Cinst.a
elapsed = time.perf_counter() - s
print("elapsed dataclass:", (elapsed*1000), "milliseconds")

s = time.perf_counter()
for i in range(0,250000000):
    x = Dinst.a
elapsed = time.perf_counter() - s
print("elapsed dataclass-with-slots:", (elapsed*1000), "milliseconds")
```

Also note that it IS possible to annotate attrs-classes using the PEP 526 annotation (i.e. `a: int = 0` instead of `a = attr.ib(type=int, default=0)`), but then you lose out on a bunch of its extra features that are also specified as named parameters to attr.ib (such as validators, kw_only parameters, etc).

Anyway, the gist of everything is: Slots heavily beat dictionaries, reducing the RAM usage to less than half of the current dataclass implementation.

My proposal: Implement `@dataclass(slots=True)` which does the same thing as attrs: Replaces the class with a modified class that has a `__slots__` property instead of a `__dict__`. And fully supporting default values in the process.
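As a rough cross-check of the memory claim that needs no third-party package, the sketch below compares shallow sizes with `sys.getsizeof`. It is not the benchmark above: the class names `DictPoint` and `SlotPoint` and the helper `shallow_size` are made up for illustration, `sys.getsizeof` does not follow references the way `asizeof` does, and exact byte counts depend on the Python version, so only the relative order is meaningful.

```python
import sys
from dataclasses import dataclass


@dataclass
class DictPoint:
    # Ordinary dataclass: attributes live in a per-instance __dict__.
    a: int = 0
    b: int = 4


@dataclass
class SlotPoint:
    # Manual slots: no per-instance __dict__, but no default values either.
    __slots__ = ("a", "b")
    a: int
    b: int


def shallow_size(obj):
    """Size of the object itself plus its __dict__, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


dict_size = shallow_size(DictPoint())
slot_size = shallow_size(SlotPoint(0, 4))
print("dict-based:", dict_size, "bytes; slotted:", slot_size, "bytes")
```

The attribute values themselves are not counted here, which is one reason these figures will not match the `asizeof` numbers quoted earlier.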
On Friday, September 27, 2019, 07:47:41 AM PDT, Johnny Dahlberg <svartchimpans@gmail.com> wrote:
My proposal: Implement `@dataclass(slots=True)` which does the same thing as attrs: Replaces the class with a modified class that has a `__slots__` property instead of a `__dict__`. And fully supporting default values in the process.
I don't think anyone would be against this in principle; the question is implementing it, and bikeshedding. For example, if nobody's come up with a better implementation than Eric's original one, should we add a @dataclass_slots decorator, or a @slotsify that you put around @dataclass, or get rid of the guarantee that @dataclass returns your class with extra dunders?

IIRC, the consensus after the discussion at the time was that this was a feature that could be added later, after a bit more experience in the field (and maybe someone will come up with a better implementation by then), so it was just deferred to the future rather than rejected.

Meanwhile, the @slotsify or @dataclass_slots should be writable as a PyPI package. Has anyone done that? If there's a popular and stable implementation, that's a good argument for merging it into the stdlib (and leaving it on PyPI as a backport). If it turned out to be tricky to implement without modifying dataclass itself, explaining why could also be a good argument for moving it into the stdlib.
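To make the `@slotsify` idea concrete, here is a minimal sketch of such a decorator. It is an assumption about how one could be written, not the implementation referred to in the thread: only the name `slotsify` comes from the message above, and the body simply rebuilds the class with `__slots__` taken from `dataclasses.fields`. Because it returns a new class object, it also shows why the "returns your class" guarantee is at stake.

```python
import dataclasses
from dataclasses import dataclass


def slotsify(cls):
    """Return a copy of dataclass `cls` that stores its fields in __slots__."""
    if "__slots__" in cls.__dict__:
        raise TypeError(f"{cls.__name__} already defines __slots__")
    namespace = dict(cls.__dict__)
    field_names = tuple(f.name for f in dataclasses.fields(cls))
    namespace["__slots__"] = field_names
    for name in field_names:
        # Class-level defaults would collide with the slot descriptors.
        # The generated __init__ already captured each default value.
        namespace.pop(name, None)
    # These descriptors belong to the old class and must not be copied.
    namespace.pop("__dict__", None)
    namespace.pop("__weakref__", None)
    new_cls = type(cls)(cls.__name__, cls.__bases__, namespace)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@slotsify
@dataclass
class Point:
    x: float = 0.0
    y: float = 0.0


p = Point(1.5)
print(p, hasattr(p, "__dict__"))
```

Since the class is copied, anything that captured the original class object before decoration (for example a method using zero-argument `super()`) still refers to the old class, which is one reason the copying approach is considered a compromise.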
On Sep 27, 2019, at 12:52 PM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Friday, September 27, 2019, 07:47:41 AM PDT, Johnny Dahlberg <svartchimpans@gmail.com> wrote:
My proposal: Implement `@dataclass(slots=True)` which does the same thing as attrs: Replaces the class with a modified class that has a `__slots__` property instead of a `__dict__`. And fully supporting default values in the process.
I don't think anyone would be against this in principle; the question is implementing it, and bikeshedding. For example, if nobody's come up with a better implementation than Eric's original one, should we add a @dataclass_slots decorator, or a @slotsify that you put around @dataclass, or get rid of the guarantee that @dataclass returns your class with extra dunders.
IIRC, the consensus after the discussion at the time was that this was a feature that could be added later, after a bit more experience in the field (and maybe someone will come up with a better implementation by then), so it was just deferred to the future rather than rejected.
Meanwhile, the @slotsify or @dataclass_slots should be writable as a PyPI package. Has anyone done that? If there's a popular and stable implementation, that's a good argument for merging it into the stdlib (and leaving it on PyPI as a backport). If it turned out to be tricky to implement without modifying dataclass itself, explaining why could also be a good argument for moving it into the stdlib.
I keep threatening to add a more-dataclasses (or similar) to PyPI. Maybe I’ll actually get around to it now and add this decorator.

Eric
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OQGEYZ... Code of Conduct: http://python.org/psf/codeofconduct/
On 27.09.19 19:48, Andrew Barnert via Python-ideas wrote:
On Friday, September 27, 2019, 07:47:41 AM PDT, Johnny Dahlberg <svartchimpans@gmail.com> wrote:
My proposal: Implement `@dataclass(slots=True)` which does the same thing as attrs: Replaces the class with a modified class that has a `__slots__` property instead of a `__dict__`. And fully supporting default values in the process.
I don't think anyone would be against this in principle; the question is implementing it, and bikeshedding. For example, if nobody's come up with a better implementation than Eric's original one, should we add a @dataclass_slots decorator, or a @slotsify that you put around @dataclass, or get rid of the guarantee that @dataclass returns your class with extra dunders.
IIRC, the consensus after the discussion at the time was that this was a feature that could be added later, after a bit more experience in the field (and maybe someone will come up with a better implementation by then), so it was just deferred to the future rather than rejected.
Meanwhile, the @slotsify or @dataclass_slots should be writable as a PyPI package. Has anyone done that? If there's a popular and stable implementation, that's a good argument for merging it into the stdlib (and leaving it on PyPI as a backport). If it turned out to be tricky to implement without modifying dataclass itself, explaining why could also be a good argument for moving it into the stdlib.
I think it needs explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name. It would also be useful for other applications, like cached_property.
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.

I do think that it would be nice to have a way to automatically create __slots__ from annotations. It would be even nicer if that could be done without copying the class object (as the current state of the art requires: https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L23 ).

Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.

    class Point:
        __slots__ = "__auto__"
        x: float
        y: float

This would be independent from the @dataclass decorator (though the decorator may have to be aware of the magic value).

If that's too wacky, we could also use a class keyword argument:

    class Point(slots=True):
        x: float
        y: float

(Though arguably that's just as wacky. :-)

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Fri, Sep 27, 2019 at 5:21 PM Guido van Rossum <guido@python.org> wrote:
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.
I do think that it would be nice to have a way to automatically create __slots__ from annotations. It would be even nicer if that could be done without copying the class object (as the current state of the art requires: https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L23 ).
Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.
    class Point:
        __slots__ = "__auto__"
        x: float
        y: float
This would be independent from the @dataclass decorator (though the decorator may have to be aware of the magic value).
If that's too wacky, we could also use a class keyword argument:
    class Point(slots=True):
        x: float
        y: float
(Though arguably that's just as wacky. :-)
An argument is less wacky than assigning a magic value to a magic dunder. -gps
On Sep 27, 2019, at 17:20, Guido van Rossum <guido@python.org> wrote:
Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.
    class Point:
        __slots__ = "__auto__"
        x: float
        y: float
Would this confuse any existing automated tools (or dataclass-like libraries) into thinking you’re declaring slots named _, a, u, t, and o? (I don’t think there’s any danger of confusing human readers, at least.)
Here's an idea I was toying with in thinking about the problem this evening. Currently, python complains if you try to add a class member that will conflict with a slot:
>>> class C:
...     __slots__="x"
...     x=1
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'x' in __slots__ conflicts with class variable
What if the slots machinery were changed so that there was a warning propagated instead, and the conflicting member value(s) were saved in some appropriate place in the class namespace? Maybe something like __slot_conflicts__, or something like that.
>>> class C:
...     __slots__="x"
...     x=1
...
>>> C.__slot_conflicts__["x"]
1
This would give the decorator the opportunity to find the stuff that was put in those fields, and sort out the class definition in an expected way.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
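The `__slot_conflicts__` idea can be approximated today with a metaclass, without touching the slots machinery. The sketch below is only an emulation under that assumption: `SlotConflictMeta` is a hypothetical name, and unlike the proposal it emits no warning; it just moves colliding values out of the class namespace before `type.__new__` sees them.

```python
class SlotConflictMeta(type):
    """Hypothetical metaclass: move values that collide with slots aside."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        slots = namespace.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)  # a bare string names a single slot
        conflicts = {}
        for slot in slots:
            if slot in namespace:
                # Without this pop, type.__new__ raises ValueError.
                conflicts[slot] = namespace.pop(slot)
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        cls.__slot_conflicts__ = conflicts
        return cls


class C(metaclass=SlotConflictMeta):
    __slots__ = "x"
    x = 1


print(C.__slot_conflicts__["x"])
```

A class decorator applied afterwards could read `C.__slot_conflicts__` to recover the intended default values.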
On Fri, Sep 27, 2019 at 6:41 PM Ricky Teachey <ricky@teachey.org> wrote:
Here's an idea I was toying with in thinking about the problem this evening.
Currently, python complains if you try to add a class member that will conflict with a slot:
>>> class C:
...     __slots__="x"
...     x=1
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'x' in __slots__ conflicts with class variable
What if the slots machinery were changed so that there was a warning propagated instead, and the conflicting member value(s) were saved in some appropriate place in the class namespace? Maybe something like __slot_conflicts__, or something like that.
>>> class C:
...     __slots__="x"
...     x=1
...
>>> C.__slot_conflicts__["x"]
1
This would give the decorator the opportunity to find the stuff that was put in those fields, and sort out the class definition in an expected way.
But it would silently do nothing if there was no class decorator to look at it.

However if we did `__slots__ = "__auto__"` then we might finagle it so that the initializers could be preserved:

    class C:
        __slots__ = "__auto__"  # or MRAB's suggested __slots__ = ...
        x: float = 0.0
        y: float = 0.0

I'm not sure what to do with unannotated initializers -- I worry that it's a slippery slope, since we don't want methods to be accidentally slotified. But decorated methods may have a variety of types (e.g. class methods, static methods, properties) and user-defined decorators could do other things. So I'd rather keep the rule for auto-slots simple: if there's an annotation (and it's not `ClassVar[something]`) then it's a slot, otherwise it's a class variable that can't be shadowed by an instance variable.

--
--Guido van Rossum (python.org/~guido)
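The selection rule stated above (annotated and not `ClassVar` means slot, everything else stays a class variable) can be sketched as a small helper. The function name `auto_slot_names` is hypothetical, and the sketch only computes which names would qualify; it does not create any slots and does not handle string annotations such as those produced by `from __future__ import annotations`.

```python
from typing import ClassVar, get_origin


def auto_slot_names(cls):
    """Names that the proposed rule would turn into slots (illustrative only)."""
    annotations = getattr(cls, "__annotations__", {})
    names = []
    for name, annotation in annotations.items():
        if annotation is ClassVar or get_origin(annotation) is ClassVar:
            continue  # ClassVar[...] stays a class variable
        names.append(name)
    return tuple(names)


class C:
    x: float = 0.0
    y: float = 0.0
    instances: ClassVar[int] = 0  # annotated, but explicitly a class variable
    scale = 2.0                   # unannotated initializer: not a slot

    def norm(self):               # methods carry no class-level annotation
        return (self.x ** 2 + self.y ** 2) ** 0.5


print(auto_slot_names(C))
```

Keeping the rule tied to annotations means methods, properties, and other decorated members are never picked up by accident, which is the slippery slope the message is worried about.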
But it would silently do nothing if there was no class decorator to look at it.
This doesn't seem like such a big deal to me. It would simply be the way slots works.

However if we did `__slots__ = "__auto__"` then we might finagle it so that the initializers could be preserved:

    class C:
        __slots__ = "__auto__"  # or MRAB's suggested __slots__ = ...
        x: float = 0.0
        y: float = 0.0

I do like this idea a lot.

Saving, or preserving, the initialized values somewhere is mainly what I was driving at with the so-called __slot_conflicts__ idea. Perhaps instead the slots descriptor object could store the value? Something like:

    class C:
        __slots__ = ...
        x: float = 0.0
        y: float = 0.0

    assert C.x.__value__ == 0.0
    assert C.y.__value__ == 0.0

If that becomes the way this is solved, should slots be made to also accept a dict to create starting values?

    class C:
        __slots__ = dict(x = 0.0, y = 0.0)

    assert C.x.__value__ == 0.0
    assert C.y.__value__ == 0.0
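The "descriptor stores the value" variant can be emulated today by wrapping the real slot descriptor after the class exists. Everything named here is an assumption made for illustration: `SlotWithValue` and `with_slot_values` are hypothetical, only the attribute name `__value__` comes from the message, and the fallback to the stored value when a slot is unset is my own addition rather than part of the proposal.

```python
class SlotWithValue:
    """Hypothetical wrapper around a slot descriptor that remembers a value."""

    def __init__(self, member, value):
        self._member = member      # the real slot descriptor
        self.__value__ = value     # attribute name taken from the message

    def __get__(self, obj, owner=None):
        if obj is None:
            return self            # class access, so C.x.__value__ works
        try:
            return self._member.__get__(obj, owner)
        except AttributeError:
            return self.__value__  # assumption: unset slots fall back

    def __set__(self, obj, value):
        self._member.__set__(obj, value)

    def __delete__(self, obj):
        self._member.__delete__(obj)


def with_slot_values(**values):
    """Class decorator that wraps the named slots of an already slotted class."""
    def decorate(cls):
        for name, value in values.items():
            setattr(cls, name, SlotWithValue(cls.__dict__[name], value))
        return cls
    return decorate


@with_slot_values(x=0.0, y=0.0)
class C:
    __slots__ = ("x", "y")


c = C()
c.x = 3.0
print(C.x.__value__, c.x, c.y)
```

The extra Python-level descriptor adds a call on every attribute access, so this shows the shape of the idea rather than a way to keep the speed of plain slots.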
On Sat, Sep 28, 2019 at 2:26 PM Ricky Teachey <ricky@teachey.org> wrote:
But it would silently do nothing if there was no class decorator to look at it.

This doesn't seem like such a big deal to me. It would simply be the way slots works.

I was thinking it's currently an error. And you chose a name (__slot_conflicts__) suggesting that there still was a problem. But yeah, maybe other than that it's no big deal.

However if we did `__slots__ = "__auto__"` then we might finagle it so that the initializers could be preserved:

    class C:
        __slots__ = "__auto__"  # or MRAB's suggested __slots__ = ...
        x: float = 0.0
        y: float = 0.0

I do like this idea a lot.

Saving, or preserving, the initialized values somewhere is mainly what I was driving at with the so-called __slot_conflicts__ idea. Perhaps instead the slots descriptor object could store the value? Something like:

    class C:
        __slots__ = ...
        x: float = 0.0
        y: float = 0.0

    assert C.x.__value__ == 0.0
    assert C.y.__value__ == 0.0

Why not -- they have to go somewhere. :-/

If that becomes the way this is solved, should slots be made to also accept a dict to create starting values?

    class C:
        __slots__ = dict(x = 0.0, y = 0.0)

    assert C.x.__value__ == 0.0
    assert C.y.__value__ == 0.0

But what would be the use case for that? When would you ever prefer to write this rather than the first version? I guess if the slots are computed -- but who uses computed slots nowadays?

--
--Guido van Rossum (python.org/~guido)
I was thinking it's currently an error. And you chose a name (__slot_conflicts__) suggesting that there still was a problem. But yeah, maybe other than that it's no big deal.

It's a good point; the name does seem to say "here's a problem that needs to be solved". Your comment makes me think all the more that attaching them to the descriptor seems better, since it doesn't say "fix this".

If that becomes the way this is solved, should slots be made to also accept a dict to create starting values?

    class C:
        __slots__ = dict(x = 0.0, y = 0.0)

    assert C.x.__value__ == 0.0
    assert C.y.__value__ == 0.0

But what would be the use case for that? When would you ever prefer to write this rather than the first version? I guess if the slots are computed -- but who uses computed slots nowadays?

I guess I can't think of a compelling one. Maybe if you are one of those people who really doesn't like the typing stuff. It just seemed like an obvious way for the language to behave if things were modified such that these initialized values get stored somewhere for later use. However it just occurred to me: if it worked that way, how could you signal that the slot has no value?

    class C:
        __slots__ = dict(x = 0.0, y = None, z = <NO VALUE>)
On Sep 27, 2019, at 8:23 PM, Guido van Rossum <guido@python.org> wrote:
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.
I do think that it would be nice to have a way to automatically create __slots__ from annotations. It would be even nicer if that could be done without copying the class object (as the current state of the art requires: https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L23).
Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.
    class Point:
        __slots__ = "__auto__"
        x: float
        y: float
I think a sentinel like None or a new typing.use_annotations_for_slots (need a better name, of course) would be better than a magic string, especially since strings are iterable.
This would be independent from the @dataclass decorator (though the decorator may have to be aware of the magic value).
If that's too wacky, we could also use a class keyword argument:
    class Point(slots=True):
        x: float
        y: float
(Though arguably that's just as wacky. :-)
I like a special value for __slots__. I might look into the feasibility of this.

Eric
On 2019-09-28 03:12, Eric V. Smith wrote:
On Sep 27, 2019, at 8:23 PM, Guido van Rossum <guido@python.org> wrote:
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.
I do think that it would be nice to have a way to automatically create __slots__ from annotations. It would be even nicer if that could be done without copying the class object (as the current state of the art requires: https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L23).
Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.
    class Point:
        __slots__ = "__auto__"
        x: float
        y: float
I think a sentinel like None or a new typing.use_annotations_for_slots (need a better name, of course) would be better than a magic string, especially since strings are iterable.
[snip]

I was also thinking about suggesting None, but I was wondering whether that could be misleading because it reads like "no slots".

What about "..." instead? Would that read better?

    class Point:
        __slots__ = ...
        x: float
        y: float
On Fri, Sep 27, 2019 at 8:01 PM MRAB <python@mrabarnett.plus.com> wrote:
I was also thinking about suggesting None, but I was wondering whether that could be misleading because it reads like "no slots".
What about "..." instead? Would that read better?
    class Point:
        __slots__ = ...
        x: float
        y: float
Not bad!

--
--Guido van Rossum (python.org/~guido)
I love this idea. It makes `__slots__` easy and efficient to define! The current, explicit syntax is very tedious and verbose.

`__slots__ = ...` is a very clear syntax for saying "Include ALL annotated class-variables automatically". It would relegate manual `__slots__ = { "a", "b" }` usage to when you want to define slots without annotating the variables, which pretty much nobody does. So in effect, this new syntax would mean that everyone who wants slots in their classes will use the easy triple-dot syntax! It would be a very nice language improvement.

(The `@dataclass` system itself still needs a way to allow "default variable values" (in the auto-generated `__init__`) if slots are used, though...)
On Fri, Sep 27, 2019 at 7:12 PM Eric V. Smith <eric@trueblade.com> wrote:
On Sep 27, 2019, at 8:23 PM, Guido van Rossum <guido@python.org> wrote:
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.
I do think that it would be nice to have a way to automatically create __slots__ from annotations. It would be even nicer if that could be done without copying the class object (as the current state of the art requires: https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L23 ).
Thinking aloud, perhaps this could be done by setting __slots__ to a magical value, e.g.
    class Point:
        __slots__ = "__auto__"
        x: float
        y: float
I think a sentinel like None or a new typing.use_annotations_for_slots (need a better name, of course) would be better than a magic string, especially since strings are iterable.
If `__slots__` is a string, it will be considered the name of the single slot to be created. But we have the namespace of __dunder__ names reserved, so `__slots__ = "__auto__"` should be okay. (But `__slots__ = "auto"` would not be.) I'm not keen on `None`, that looks like it would mean "no slots" (analogous to `__hash__ = None`).
This would be independent from the @dataclass decorator (though the decorator may have to be aware of the magic value).
If that's too wacky, we could also use a class keyword argument:
    class Point(slots=True):
        x: float
        y: float
(Though arguably that's just as wacky. :-)
I like a special value for __slots__. I might look in to the feasibility of this.
Please do. It sounds simpler than the `(slots=True)` version -- IIRC keyword args to classes are hard to work with.

--
--Guido van Rossum (python.org/~guido)
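The point about string values is easy to check. The short sketch below (class name `Single` is made up) shows that a string assigned to `__slots__` creates exactly one slot with that name, while a tool that naively iterates over `__slots__` would see individual characters, which is the confusion raised earlier in the thread.

```python
class Single:
    __slots__ = "value"  # one slot called "value", not five one-letter slots


s = Single()
s.value = 42
print(Single.__slots__, s.value)

# A tool that naively iterates over __slots__ sees characters instead.
naive_names = sorted(set("__auto__"))
print(naive_names)
```

This is why a reserved dunder-style string such as `"__auto__"` is workable for the interpreter but could still trip up third-party code that assumes `__slots__` is always an iterable of names.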
On Fri, Sep 27, 2019 at 11:18 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
I think it needs an explicit support in the type creation machinery (as __slots__ itself has) to support descriptors and slots with the same name.
That would be tough, since __slots__ is currently implemented by creating a separate descriptor for each slot.
One thought I've had about __slots__ would be it'd be nice to take a dictionary in the form of:

    class C:
        __slots__ = {'a': ???, 'b': ???}

You could actually provide this dictionary today, but the values would be ignored. The values could start to do interesting things. One flavor of that would be that they could indicate the underlying storage used for the slots (maybe with 'i' for int32, 'b' for byte, 'l' for long, or whatever color encoding sounds good). This is just mapping into the available storage types that are already available in structmember.c. That's just extending the existing use case of slots as being a more memory efficient storage representation, and might help people avoid dropping into Cython just to get compact instance members.

But another application of that could be accepting a callable which would then receive the descriptor, and return a new descriptor. One example of what that'd let you do is build a cached-property decorator that would do the get/sets into the slot. But presumably it would also provide a way for other scenarios where you want to explicitly collide with the get/set descriptor with a member. It doesn't help so much with the verbosity of defining these things that I think was mentioned elsewhere in this thread. And it doesn't play so well w/ class decorators, but could be more usable with meta-classes.
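The remark that a dictionary can already be supplied is straightforward to try. In the sketch below the dictionary values are arbitrary placeholder strings chosen for illustration; class creation only looks at the keys, so the resulting instances behave like any other slotted object.

```python
class C:
    # The values are placeholders; class creation only uses the keys.
    __slots__ = {"a": "first field", "b": "second field"}


c = C()
c.a, c.b = 1, 2
print(sorted(C.__slots__), c.a, c.b, hasattr(c, "__dict__"))
```

The dictionary stays available as `C.__slots__` after the class is built, which is what would let a metaclass or decorator give the values a meaning later.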
On Sat, Sep 28, 2019 at 5:04 PM Dino Viehland <dinoviehland@gmail.com> wrote:
One thought I've had about __slots__ would be it'd be nice to take a dictionary in the form of:
class C: __slots__ = {'a': ???, 'b': ???}
You could actually provide this dictionary today, but the values would be ignored. The values could start to do interesting things. One flavor of that would be that they could indicate the underlying storage used for the slots (maybe with 'i' for int32, 'b' for byte, 'l' for long, or whatever color encoding sounds good). This is just mapping into the available storage types that are already available in structmember.c. That's just extending the existing use case of slots as being a more memory efficient storage representation, and might help people avoid dropping into Cython just to get compact instance members.
Hm... But then you'd be paying for boxing/unboxing cost on each access. I'm actually okay with needing to use Cython if you're really that tight for space.
But another application of that could be accepting a callable which would then receive the descriptor, and return a new descriptor. One example of what that'd let you do is build a cached-property decorator that would do the get/sets into the slot. But presumably it would also provide a way for other scenarios where you want to explicitly collide with the get/set descriptor with a member. It doesn't help so much with the verbosity of defining these things that I think was mentioned elsewhere in this thread. And it doesn't play so well w/ class decorators, but could be more usable with meta-classes.
But you could do that without the wacky API by just naming the slots _foo, _bar and have properties foo, bar.

--
--Guido van Rossum (python.org/~guido)
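The underscore-prefixed alternative looks roughly like the sketch below, applied to the cached-property use case mentioned in the thread. The class `Rectangle` and its attributes are invented for illustration; the only technique shown is a private slot `_area` paired with a public property `area`.

```python
class Rectangle:
    __slots__ = ("width", "height", "_area")

    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        try:
            return self._area  # cached value, if already computed
        except AttributeError:
            # An unset slot raises AttributeError, so compute and store once.
            self._area = self.width * self.height
            return self._area


r = Rectangle(3, 4)
print(r.area, r.area)
```

Note that this simple cache is not invalidated when `width` or `height` changes later; a real class would need to handle that separately.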
On Sat, Sep 28, 2019 at 5:16 PM Guido van Rossum <guido@python.org> wrote:
> Hm... But then you'd be paying for boxing/unboxing cost on each access. I'm actually okay with needing to use Cython if you're really that tight for space.
Right, but those would be rather ephemeral vs more potentially long lived members. It'll certainly depend upon the usage pattern whether or not it's worth it. And the point of using __slots__ is that you've decided you're tight on space but we don't force usage of Cython to go dict-free.
> But you could do that without the wacky API by just naming the slots _foo, _bar and have properties foo, bar.
You can also just mutate the type after the fact and replace the descriptor with the wrapped descriptor. Wouldn't the prefixed solution work just as well for dataclasses in some form as well?
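For readers following the "replace the descriptor with the wrapped descriptor" idea, here is a minimal sketch of a cached property that stores its value in an existing slot. The names `SlotCached`, `Circle`, and `compute_area` are hypothetical; the thread does not define any such API, and this only shows that the approach is possible with today's slots.

```python
class SlotCached:
    """Hypothetical wrapper that caches func(obj) inside an existing slot."""

    def __init__(self, slot, func):
        self.slot = slot  # original member descriptor created by __slots__
        self.func = func  # computes the value on first access

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return self.slot.__get__(obj, objtype)
        except AttributeError:
            # The slot is still empty: compute once and store the result.
            value = self.func(obj)
            self.slot.__set__(obj, value)
            return value


class Circle:
    __slots__ = ("radius", "area")

    def __init__(self, radius):
        self.radius = radius


calls = []  # records each real computation, to show the caching


def compute_area(circle):
    calls.append(circle.radius)
    return 3.14159 * circle.radius ** 2


# Mutate the type after the fact: swap the slot descriptor for the wrapper.
Circle.area = SlotCached(Circle.__dict__["area"], compute_area)

c = Circle(2)
first = c.area
second = c.area
```

The second access reads the stored value straight from the slot, so `compute_area` runs only once per instance and the instance still has no `__dict__`.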
On Sat, Sep 28, 2019 at 5:56 PM Dino Viehland <dinoviehland@gmail.com> wrote:
>> Hm... But then you'd be paying for boxing/unboxing cost on each access. I'm actually okay with needing to use Cython if you're really that tight for space.
> Right, but those would be rather ephemeral vs more potentially long lived members. It'll certainly depend upon the usage pattern whether or not it's worth it. And the point of using __slots__ is that you've decided you're tight on space but we don't force usage of Cython to go dict-free.
Well, I expect the implementation to be too hairy to bother.
>> But you could do that without the wacky API by just naming the slots _foo, _bar and have properties foo, bar.
> You can also just mutate the type after the fact and replace the descriptor with the wrapped descriptor. Wouldn't the prefixed solution work just as well for dataclasses in some form as well?
Sorry, I lost track of what you call "prefixed".
--Guido van Rossum (python.org/~guido)
On Fri, Sep 27, 2019, at 12:48, Andrew Barnert via Python-ideas wrote:
> or get rid of the guarantee that @dataclass returns your class with extra dunders.
Why is dataclass a decorator instead of a metaclass (or, as below, pseudo-metaclass) anyway? Is it just that the decorator syntax looks nicer? If it were a metaclass, it could add __slots__ before constructing the class.
    import dataclasses

    def slot_dataclass(name, bases, dct):
        dct['__slots__'] = dct['__annotations__'].keys()
        return dataclasses.dataclass(type(name, bases, dct))
If the problem is that the metaclass syntax is ugly, maybe we need a nicer-looking metaclass syntax, e.g. "class MyClass(bases) as slot_dataclass:"
> Why is dataclass a decorator instead of a metaclass (or, as below, pseudo-metaclass) anyway?
One reason: because a class can only have one metaclass. So if dataclass were a metaclass, it would not be possible to create a dataclass using an existing metaclass without metaclass multiple inheritance... which, ain't nobody got time for THAT.
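The one-metaclass constraint can be seen in a small sketch. `RegistryMeta` is a hypothetical stand-in for "dataclass implemented as a metaclass"; the other class names are likewise invented for illustration.

```python
import abc
from dataclasses import dataclass


class RegistryMeta(type):
    """Hypothetical stand-in for a dataclass-style metaclass."""


class Base(abc.ABC):  # Base already uses abc.ABCMeta as its metaclass
    pass


# Combining two unrelated metaclasses fails when the class is created.
try:
    class Broken(Base, metaclass=RegistryMeta):
        pass
    conflict = None
except TypeError as exc:
    conflict = str(exc)


# The decorator form composes with the existing metaclass without trouble.
@dataclass
class Shape(Base):
    name: str

    @abc.abstractmethod
    def area(self):
        ...


@dataclass
class Square(Shape):
    side: float = 1.0

    def area(self):
        return self.side ** 2


sq = Square("sq", 2.0)
```

`Broken` cannot be defined unless someone writes a metaclass deriving from both `RegistryMeta` and `abc.ABCMeta`, whereas `Shape` and `Square` keep `abc.ABCMeta` and still get the generated dataclass methods.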
On 9/27/2019 6:11 PM, Random832 wrote:
> On Fri, Sep 27, 2019, at 12:48, Andrew Barnert via Python-ideas wrote:
>> or get rid of the guarantee that @dataclass returns your class with extra dunders.
> Why is dataclass a decorator instead of a metaclass (or, as below, pseudo-metaclass) anyway? Is it just that the decorator syntax looks nicer?
It's in the PEP 557 Rationale: so that the class can use metaclasses any way it wants, without interference from dataclasses.
Eric
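Since `__slots__` has to be present when the class object is created, a decorator-based approach ends up building a new class rather than returning the original one, which is the guarantee discussed above. A minimal sketch of that rebuild step follows; `add_slots` is a hypothetical helper written for illustration, not part of the `dataclasses` module, and it ignores complications such as methods that use zero-argument `super()`.

```python
import dataclasses


def add_slots(cls):
    """Hypothetical helper: rebuild dataclass `cls` with __slots__ for its fields."""
    field_names = tuple(f.name for f in dataclasses.fields(cls))
    namespace = dict(cls.__dict__)
    namespace["__slots__"] = field_names
    for name in field_names:
        # Class-level defaults would clash with the slot descriptors.
        # The generated __init__ already carries the default values.
        namespace.pop(name, None)
    # These descriptors belong to the old class and must not be copied.
    namespace.pop("__dict__", None)
    namespace.pop("__weakref__", None)
    new_cls = type(cls)(cls.__name__, cls.__bases__, namespace)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@add_slots
@dataclasses.dataclass
class Point:
    x: int = 0
    y: int = 4


p = Point(y=7)
```

Here `Point` keeps its default values and generated methods, but the name is bound to a different class object than the one `@dataclass` originally decorated. Python 3.10 later added `@dataclass(slots=True)`, which likewise returns a new class.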
participants (10)
- Andrew Barnert
- Dino Viehland
- Eric V. Smith
- Gregory P. Smith
- Guido van Rossum
- Johnny Dahlberg
- MRAB
- Random832
- Ricky Teachey
- Serhiy Storchaka