
I have a dictionary (in RPython) that looks something like this:

    class MyDict(Obj):
        def add(self, k, v):
            # copy-on-write: every add replaces self.dict with a new dict
            newdict = self.dict.copy()
            newdict[k] = v
            self.dict = newdict

        def get(self, k):
            d = self.dict
            return MyDict._static_get(d, k)

        @staticmethod
        @purefunction
        def _static_get(d, k):
            if k not in d:
                return None
            return d[k]

I'm trying to figure out the best way to optimize this. As you can see, this structure is "append only". That is, if mydict.get("foo") returns a value, it will always return that value, for all time. In that sense, the function is pure. If, however, the function returns None, then the result could change in the future.

This is for my Clojure on PyPy implementation. The idea is that types can be extended via protocols, at runtime, at any point in the program's execution. However, once a given type is extended to support a given interface (or protocol), that can never be changed. That is, once we extend Foo so that it implements Foo.count(), it will implement Foo.count() for all time.

Any thoughts on how to optimize this further for the JIT? I'd like to promote the value of get() to a constant, but I can only do that as long as it is not None. After Foo is extended, I'd like the JIT to regenerate the jitted code to remove all guards, hopefully saving a few cycles.

Timothy

--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

On Tue, Nov 29, 2011 at 3:04 PM, Timothy Baldridge <tbaldridge@gmail.com> wrote:

Sounds like the idea you're looking for is out-of-line guards. The idea is that you have a field which rarely changes, and when it does change, all code that depends on it should be regenerated. So you would have an object with the dict and a signature; you could mutate the dict in place, but whenever you did, you'd need to update the signature as well. PyPy's module-dictionary implementation shows an example of this:

https://bitbucket.org/pypy/pypy/src/default/pypy/objspace/std/celldict.py

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
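A minimal sketch of the dict-plus-signature idea described above, written as plain Python so it runs outside the RPython toolchain. The names `VersionTag`, `VersionedDict` and `_pure_get` are hypothetical, and the `elidable` decorator here is a no-op stand-in for the RPython JIT hint of the same name; this is an illustration of the pattern under those assumptions, not the code in celldict.py.

```python
def elidable(func):
    # Stand-in (assumption): under RPython this would be the JIT hint
    # from rlib.jit; here it does nothing so the sketch runs anywhere.
    return func


class VersionTag(object):
    """Empty object whose identity acts as the 'signature' of the dict."""


class VersionedDict(object):
    # Hypothetical class. The trailing '?' marks the field as
    # quasi-immutable for the RPython JIT; plain Python ignores it.
    _immutable_fields_ = ['version?']

    def __init__(self):
        self.dict = {}
        self.version = VersionTag()

    def add(self, k, v):
        # Mutate in place, then install a fresh signature so that results
        # remembered for the old signature no longer apply.
        self.dict[k] = v
        self.version = VersionTag()

    def get(self, k):
        return self._pure_get(self.version, k)

    @elidable
    def _pure_get(self, version, k):
        # For a fixed (self, version, k) the answer never changes,
        # including a None answer, because any add() changes the version.
        return self.dict.get(k, None)
```

The point of passing `version` into the lookup is that a None result also becomes safe to treat as constant: it is only valid for that one signature, and the signature is replaced on every mutation.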

Hi,

On Tue, Nov 29, 2011 at 21:28, Romain Guillebert <romain.py@gmail.com> wrote:
> Probably because he (as a clojure developer) likes immutability of data structures.

No, it's really needed for the way it is written: by creating a new dict, the old purefunction results no longer apply. But we are (indeed) using a slightly different approach in PyPy by not copying the dict, but instead creating a new empty 'signature' object that we pass to the purefunction too.

We don't have anything exactly similar in PyPy, I think. I would go for something along the lines of:

    class Cell(object):
        _immutable_fields_ = ['content?']    # quasi-immutable
        content = None

    _all_cells = {}

    @elidable    # same as @purefunction
    def get_cell(key):
        return _all_cells.setdefault(key, Cell())

or, depending on the usage, maybe @elidable_promote, if the key should always be a jit-constant.

In this way the user gets a Cell corresponding to the key they asked for, and can then read the 'content' field, which is initially None but may be set to something else. Because it is a quasi-immutable field, this is all done with no machine code produced. If later the same Cell has its 'content' field modified, then the old machine code is discarded and new code is produced.

Note the indirection: the JIT should not see the @elidable code, but just call it at tracing time; but the JIT must see the read of the 'content' field, to be able to use the fact that it's a quasi-immutable.

A bientôt,

Armin.
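To connect the Cell suggestion back to the original protocol question, here is a hedged sketch in plain Python. The functions `extend` and `lookup`, and the use of a (type name, method name) tuple as the key, are assumptions made for illustration and are not from the thread; `elidable` is again a no-op stand-in for the RPython JIT hint.

```python
def elidable(func):
    # Stand-in (assumption): replaces the RPython JIT hint so that the
    # sketch runs under an ordinary Python interpreter.
    return func


class Cell(object):
    _immutable_fields_ = ['content?']    # quasi-immutable under RPython
    content = None


_all_cells = {}


@elidable
def get_cell(key):
    # The same key always yields the same Cell object, which is what
    # makes treating this lookup as elidable reasonable.
    cell = _all_cells.get(key)
    if cell is None:
        cell = Cell()
        _all_cells[key] = cell
    return cell


def extend(type_name, method_name, impl):
    # Hypothetical helper: a type may be extended once and never changed,
    # so 'content' goes from None to impl at most one time.
    cell = get_cell((type_name, method_name))
    if cell.content is not None:
        raise ValueError("%s.%s is already implemented"
                         % (type_name, method_name))
    cell.content = impl


def lookup(type_name, method_name):
    # The read of 'content' happens here, outside the elidable function,
    # mirroring the indirection described above.
    return get_cell((type_name, method_name)).content
```

With this shape, `lookup("Foo", "count")` returns None until `extend("Foo", "count", impl)` has run, and returns `impl` from then on; the single write to `content` is the event that would invalidate previously compiled code.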

participants (5)
- Alex Gaynor
- Armin Rigo
- Maciej Fijalkowski
- Romain Guillebert
- Timothy Baldridge