<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Sun, 19 Jun 2016 at 19:37 Guido van Rossum &lt;<a href="mailto:guido@python.org">guido@python.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sun, Jun 19, 2016 at 6:29 PM, Brett Cannon <span dir="ltr">&lt;<a href="mailto:brett@python.org" target="_blank">brett@python.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><span><div dir="ltr">On Sat, 18 Jun 2016 at 21:49 Guido van Rossum &lt;<a href="mailto:guido@python.org" target="_blank">guido@python.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><div><div>Hi Brett,<br><br></div>I've got a few questions about the specific design. You probably know the answers; it would be nice to have them in the PEP.<br></div></div></div></div></div></blockquote><div><br></div></span><div>Once you're happy with my answers I'll update the PEP.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Soon!<br> <br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><div><br></div>First, why not have a global hook? What does a hook per interpreter give you? 
Would even finer granularity buy anything?<br></div></div></div></div></blockquote><div><br></div></span><div>We initially considered a per-code-object hook, but we figured it was unnecessary to have that level of control, especially since projects like Numba have gotten away with not needing it for this long (although I suspect that's because they use a decorator, so they can just return an object that overrides __call__()).</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>So they do it at the function object level? <br></div></div></div></div></blockquote><div><br></div><div>Yes. They use a decorator, allowing them to completely control what function object gets returned.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>We didn't think that a global one was appropriate as different workloads may call for different JITs/debuggers/etc. and there is no guarantee that you are executing every interpreter with the same workload. 
Plus we figured people might simply import their JIT of choice and as a side-effect set the hook, and since imports are a per-interpreter thing that seemed to suggest the granularity of interpreters.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I like import as the argument here.<br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><br></div><div>IOW it seemed to be more in line with sys.settrace() than some global thing for the process.</div><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><br></div>Next, I'm a bit (but no more than a bit) concerned about the extra 8 bytes per code object, especially since for most people this is just waste (assuming most people won't be using Pyjion or Numba). Could it be a compile-time feature (requiring recompilation of CPython but not extensions)?</div></div></div></blockquote><div><br></div></span><div>Probably. It does water down potential usage thanks to needing a special build. If the decision is "special build or not", I would simply pull out this part of the proposal as I wouldn't want to add a flag that influences what is or is not possible for an interpreter.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>MRAB's response made me think of a possible approach: the co_extra field could be the very last field of the PyCodeObject struct and only present if a certain flag is set in co_flags. 
This is similar to a trick used by X11 (I know, it's long ago :-).<br></div></div></div></div></blockquote><div><br></div><div>But that doesn't resolve your memory worry, right? For a JIT you will have to access the memory regardless for execution count (unless Yury's patch to add caching goes in, in which case it will be provided by code objects already).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div> Could you figure out some other way to store per-code-object data? It seems you considered this but decided that the co_extra field was simpler and faster; I'm basically pushing a little harder on this. Of course most of the PEP would disappear without this feature; the extra interpreter field is fine.<br></div></div></div></blockquote><div><br></div></span><div>Dino and I thought of two potential alternatives, neither of which we have taken the time to implement and benchmark. One is to simply have a hash table of memory addresses to JIT data that is kept on the JIT side of things. Obviously it would be nice to avoid the overhead of a hash table lookup on every function call. This also doesn't help minimize memory when the code object gets GC'ed.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I guess the prospect of the extra hash lookup per call isn't great given that this is about perf... 
<br></div></div></div></div></blockquote><div><br></div><div>It's not desirable, but it isn't the end of the world either. I don't think Dino believes it will be that big of a deal to switch to a hash table.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><br></div><div>The other potential solution we came up with was to use weakrefs. I have not looked into the details, but we were thinking that if we registered the JIT data object as a weakref on the code object, couldn't we iterate through the weakrefs attached to the code object to look for the JIT data object, and then get the reference that way? It would let us avoid a more expensive hash table lookup if we assume most code objects won't have weakrefs on them (assuming weakrefs are stored in a list), and it gives us the proper cleanup semantics we want by using the weakref cleanup callback to make sure we decref the JIT data object appropriately. But as I said, I have not looked into the feasibility of this at all, so I don't know if I'm remembering the weakref implementation details correctly.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>That would be even slower than the hash table lookup, and unbounded. So let's not go there.</div></div></div></div></blockquote><div><br></div><div>OK.<br><br></div><div>-Brett <br></div></div></div>