Safety of replacing the instance dict

Hello,

this question is not directly related to PyPy but to general Python semantics. I'm posting it here since I know there are very knowledgeable people here and honestly this is the friendliest list for questions like this.

I'm working on optimizing the __init__ methods that the attrs library generates. My question is: is there anything fundamentally wrong with having an __init__ like this?

    def __init__(self, a, b, c):
        self.__dict__ = {'a': a, 'b': b, 'c': c}

(Basically throwing away the existing instance dict and replacing it with a fresh one.)

As far as I can see, this is actually faster on CPython if there is more than one attribute, and about the same speed on PyPy (benchmarking on PyPy is kinda hard; the difference is 0.2 ns vs 0.22 ns, and I'm not sure 0.02 ns is a significant difference). This approach works for frozen classes (an attrs feature) too.

It's just that doing it this way is unconventional and a little scary. Would we be violating a Python rule somewhere and making stuff blow up later if we went this way?
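To make the question concrete, here is a minimal sketch of the two __init__ styles side by side. The class names `ByAttribute` and `ByDictReplace` are hypothetical and not part of attrs; the sketch only shows that both styles leave the instance in the same observable state.

```python
class ByAttribute:
    """Conventional style: one attribute assignment per field."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


class ByDictReplace:
    """Style asked about: replace the whole instance dict at once."""

    def __init__(self, a, b, c):
        self.__dict__ = {"a": a, "b": b, "c": c}


p = ByAttribute(1, 2, 3)
q = ByDictReplace(1, 2, 3)

# Both instances end up with the same attribute dictionary.
same_state = vars(p) == vars(q)

# The replaced dict behaves like a normal instance dict afterwards:
# later attribute assignments are stored in it.
q.d = 4
```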

Hi,

On 29 January 2018 at 11:22, Tin Tvrtković <tinchester@gmail.com> wrote:

No, it's semantically fine. But it comes with a heavy penalty on PyPy. I guess you don't see it because you measured something tiny, like creating the instance and then throwing it away; the JIT optimizes that to nothing at all in both cases. Not only is the creation time larger, but attribute access is slower, and the memory usage is larger.

A bientôt,

Armin.
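One way to avoid measuring "something tiny" is to keep the instances alive and read their attributes inside the timed code. The following is only a sketch under assumptions: the class names, the `workload` and `measure` helpers, and the loop sizes are all hypothetical, and the resulting numbers depend entirely on the interpreter and machine.

```python
import timeit


class ByAttribute:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


class ByDictReplace:
    def __init__(self, a, b, c):
        self.__dict__ = {"a": a, "b": b, "c": c}


def workload(cls, n=1000):
    """Create n instances, keep them all alive, then read every attribute."""
    objs = [cls(i, i + 1, i + 2) for i in range(n)]
    total = 0
    for obj in objs:
        total += obj.a + obj.b + obj.c
    return total


def measure(cls, repeat=3, number=50):
    """Return the best wall-clock time (seconds) for `number` workload runs."""
    return min(timeit.repeat(lambda: workload(cls), repeat=repeat, number=number))


if __name__ == "__main__":
    for cls in (ByAttribute, ByDictReplace):
        print(cls.__name__, measure(cls))
```

This measures time only; the memory usage mentioned above would need a separate measurement.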

In reply to Armin Rigo <armin.rigo@gmail.com>, Mon, Jan 29, 2018 at 1:41 PM:

Sidenote: if you do the following, you can replace the __dict__ without incurring performance penalties (Armin, please correct me if I'm wrong):

    import __pypy__

    def __init__(self):
        self.__dict__ = __pypy__.newdict('instance')

This is not directly useful for your use case (because newdict() always returns an empty dict), but it might be useful to know in general.
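A hedged sketch of how the sidenote's idea could be written so that it also runs on interpreters without the `__pypy__` module. The `newdict('instance')` call is taken from the message above; the fallback to a plain dict, the helper name `fresh_instance_dict`, and the `Point` class are assumptions made for illustration.

```python
try:
    import __pypy__  # this module only exists on PyPy

    def fresh_instance_dict():
        # Call quoted from the thread: an empty dict of the 'instance' kind.
        return __pypy__.newdict("instance")

except ImportError:

    def fresh_instance_dict():
        # Assumed fallback for other interpreters: an ordinary empty dict.
        return {}


class Point:
    def __init__(self, x, y):
        # Replace the instance dict, then fill it with normal assignments.
        self.__dict__ = fresh_instance_dict()
        self.x = x
        self.y = y


pt = Point(1, 2)
```

Since the new dict always starts empty, the attributes still have to be filled in afterwards, which is why this does not help with building the whole dict in one expression.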

Hi,

On 30 January 2018 at 13:13, Antonio Cuni <anto.cuni@gmail.com> wrote:

This is a no-op, which takes time to execute but indeed doesn't seem to disable the optimization. However, I don't see the point. You may as well write it like this:

    def __init__(self, x, y, z):
        d = self.__dict__
        d['x'] = x
        d['y'] = y
        d['z'] = z

It is a bit slower than "self.x = x; self.y = y; self.z = z" in PyPy but doesn't throw away the optimization. On CPython it is probably faster.

Armin
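The frozen-class case from the original question can be sketched with the same pattern, because writing into `self.__dict__` does not go through `__setattr__`. `FrozenPoint` below is a hypothetical stand-in for a frozen class, not the code that attrs actually generates.

```python
class FrozenPoint:
    """Hypothetical frozen class: normal attribute assignment is blocked."""

    def __init__(self, x, y, z):
        # Writing into the existing instance dict bypasses __setattr__,
        # so initialization works even though the class is "frozen".
        d = self.__dict__
        d["x"] = x
        d["y"] = y
        d["z"] = z

    def __setattr__(self, name, value):
        raise AttributeError("FrozenPoint instances are immutable")

    def __delattr__(self, name):
        raise AttributeError("FrozenPoint instances are immutable")


fp = FrozenPoint(1, 2, 3)

try:
    fp.x = 10  # goes through __setattr__ and is rejected
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True
```

The existing dict is kept and only filled in, so nothing is thrown away, in line with the suggestion above.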

participants (3): Antonio Cuni, Armin Rigo, Tin Tvrtković