
"Phillip J. Eby" <pje@telecommunity.com> writes:
At 12:39 PM 2/28/04 +0000, Michael Hudson wrote:
"Phillip J. Eby" <pje@telecommunity.com> writes:
Maybe there should instead be a tp_call_stack slot. Then the various CALL opcodes would call that slot instead of tp_call. C API calls would still go through tp_call.
People *really* should look at the patch I mentioned...
Ah, I see you've implemented this already. :)
:-)
In practice, though, I expect it would be faster to do as Jython and IronPython have done, and define a set of tp_call1, tp_call2, etc. slots that are optimized for specific calling situations, allowing C API calls to be sped up as well, provided you used things like PyObject_Call1(ob,arg), PyObject_Call2(ob,arg1,arg2), and so on.
I think this only really helps when you have a JIT compiler of some sort?
No, all you need is a mechanism similar to the one in your patch, where the creation of a method or function object includes the selecting of a function pointer for each of the call signatures. So, when creating these objects, you'd select a call0 pointer, a call1 pointer, and so on. If the function requires more arguments than provided for that slot, you just point it to an "insufficient args" version of the function. If the function has defaults that would be filled in, you point it to a version that fills in the defaults, and so on.
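A minimal sketch of that selection step, written as a standalone toy model rather than against the real CPython structures. Every name in it (ToyFunc, toy_func_init, the call0/call1/call2 slots, the -1 "insufficient args" return) is a hypothetical stand-in invented for illustration, and "objects" are plain ints to keep it self-contained:

```c
#include <assert.h>

/* Hypothetical toy model: a "function object" whose body always takes two
   ints, with per-arity entry points chosen once, at creation time.
   None of these names are CPython API. */
typedef struct ToyFunc ToyFunc;
struct ToyFunc {
    int (*body)(int a, int b);  /* the underlying two-argument code */
    int nrequired;              /* number of arguments without a default: 0..2 */
    int defaults[2];            /* defaults[i] fills argument i when omitted */
    int (*call0)(ToyFunc *f, int *result);
    int (*call1)(ToyFunc *f, int a, int *result);
    int (*call2)(ToyFunc *f, int a, int b, int *result);
};

/* "Insufficient args" variants: report failure without touching result. */
static int call0_insufficient(ToyFunc *f, int *result) {
    (void)f; (void)result;
    return -1;
}
static int call1_insufficient(ToyFunc *f, int a, int *result) {
    (void)f; (void)a; (void)result;
    return -1;
}

/* Variants that fill in the missing trailing arguments from the defaults. */
static int call0_defaults(ToyFunc *f, int *result) {
    *result = f->body(f->defaults[0], f->defaults[1]);
    return 0;
}
static int call1_defaults(ToyFunc *f, int a, int *result) {
    *result = f->body(a, f->defaults[1]);
    return 0;
}

/* Exact-arity variant: nothing to fill in. */
static int call2_exact(ToyFunc *f, int a, int b, int *result) {
    *result = f->body(a, b);
    return 0;
}

/* Creation picks one pointer per call signature, so a later call is a
   single indirect jump with no argument-count check at call time. */
static void toy_func_init(ToyFunc *f, int (*body)(int, int),
                          int nrequired, int d0, int d1) {
    assert(nrequired >= 0 && nrequired <= 2);
    f->body = body;
    f->nrequired = nrequired;
    f->defaults[0] = d0;
    f->defaults[1] = d1;
    f->call0 = (nrequired == 0) ? call0_defaults : call0_insufficient;
    f->call1 = (nrequired <= 1) ? call1_defaults : call1_insufficient;
    f->call2 = call2_exact;
}

/* Example body used to exercise the slots. */
static int toy_add(int a, int b) { return a + b; }
```

With one required argument and a default of 10 for the second, the call0 slot points at the "insufficient args" variant, call1 at the default-filling variant, and call2 at the exact one. The cost is the one the thread goes on to discuss: one hand-written variant per combination of arity and default handling.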
Well, this is psyco-style code explosion by hand :-) Doesn't sound like fun. Incidentally, I think it would be goodness if PyCFunctions exposed more info about the arguments they take.
So, there's not a JIT required, but making the patch could be tedious, repetitive and error prone. Macros could possibly help a bit with the repetition and tedium, although not necessarily the error-prone part. :)
It probably wouldn't be seriously hard to write a program-writing program for this.
Perhaps there is some information that can be gleaned from the Jython research as to what are the most common number of positional parameters for calls.
That's easy: 0 then 1 then 2 then 3 then insignificant.
Actually, I'd find that a bit surprising, as I would expect 1-argument calls to be more popular than 0-argument calls. But I guess if you look at a method call, a 0-argument call to the method maps to a 1-argument call on the function. That would at any rate put 0-argument calls in the running.
Well, it was just a guess. Maybe METH_O is more common.
Anyway, I guess a 0-argument call for an instancemethod would look something like:
Actually, this isn't much how things work now.
    PyObject *
    method_call0(PyObject *self)
    {
        if (self->im_self) {
            return PyObject_Call1(self->im_func, self->im_self);
        } else {
            return PyObject_Call0(self->im_func);
        }
    }
but, yes.
And then you'd repeat this pattern for call1 and call2, but call3 would have to fall back to creating an argument tuple if we're only going up to call3 slots.
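To make the repetition and the call3 fallback concrete, here is a rough standalone sketch, not CPython code. It assumes a hypothetical Callable struct with call0 through call3 slots plus a generic array-based slot (call_tuple) standing in for the argument-tuple path. "Objects" are plain ints, and the tuple_calls counter exists only to show when the slow path is taken:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are CPython API. */
typedef struct Callable Callable;
struct Callable {
    int (*call0)(Callable *c);
    int (*call1)(Callable *c, int a);
    int (*call2)(Callable *c, int a, int b);
    int (*call3)(Callable *c, int a, int b, int d);
    int (*call_tuple)(Callable *c, const int *args, size_t n);
};

static int tuple_calls = 0;  /* counts trips through the slow path */

/* An underlying "function" that sums its arguments. */
static int sum_call0(Callable *c) { (void)c; return 0; }
static int sum_call1(Callable *c, int a) { (void)c; return a; }
static int sum_call2(Callable *c, int a, int b) { (void)c; return a + b; }
static int sum_call3(Callable *c, int a, int b, int d) { (void)c; return a + b + d; }
static int sum_call_tuple(Callable *c, const int *args, size_t n) {
    int total = 0;
    (void)c;
    tuple_calls++;
    for (size_t i = 0; i < n; i++) total += args[i];
    return total;
}
static Callable sum_func = {
    sum_call0, sum_call1, sum_call2, sum_call3, sum_call_tuple
};

typedef struct {
    Callable base;      /* first member, so a Callable* can be cast to Method* */
    Callable *im_func;
    int has_self;       /* 0 models an unbound method */
    int im_self;
} Method;

/* Each N-argument call on a bound method becomes an (N+1)-argument call
   on the function, with im_self prepended. */
static int method_call0(Callable *c) {
    Method *m = (Method *)c;
    if (m->has_self) return m->im_func->call1(m->im_func, m->im_self);
    return m->im_func->call0(m->im_func);
}
static int method_call1(Callable *c, int a) {
    Method *m = (Method *)c;
    if (m->has_self) return m->im_func->call2(m->im_func, m->im_self, a);
    return m->im_func->call1(m->im_func, a);
}
static int method_call2(Callable *c, int a, int b) {
    Method *m = (Method *)c;
    if (m->has_self) return m->im_func->call3(m->im_func, m->im_self, a, b);
    return m->im_func->call2(m->im_func, a, b);
}
/* There is no call4 slot, so the bound case has to build an argument array. */
static int method_call3(Callable *c, int a, int b, int d) {
    Method *m = (Method *)c;
    if (m->has_self) {
        int args[4] = { m->im_self, a, b, d };
        return m->im_func->call_tuple(m->im_func, args, 4);
    }
    return m->im_func->call3(m->im_func, a, b, d);
}

static void method_init(Method *m, Callable *func, int has_self, int self) {
    assert(func != NULL);
    m->base.call0 = method_call0;
    m->base.call1 = method_call1;
    m->base.call2 = method_call2;
    m->base.call3 = method_call3;
    m->base.call_tuple = NULL;  /* generic slot omitted from this sketch */
    m->im_func = func;
    m->has_self = has_self;
    m->im_self = self;
}
```

In this model a bound method called with up to two arguments stays on the fixed-arity slots. Only the three-argument call needs the array, which is where the saving from avoiding tuple creation would stop.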
Interestingly, the patterns involved in making these functions are basically left currying (methods) and right currying (default args). It may be that there's some way to generalize this, perhaps by using wrapping objects. On the other hand, that would tend to go against the intended speedup.
I don't think this approach is going to yield mega speedups whatever. We need Psyco or similar tech for that, I think.
ARTHUR: Don't ask me how it works or I'll start to whimper. -- The Hitch-Hikers Guide to the Galaxy, Episode 11
Heh. Did you pick this quote intentionally? :)
No, but that quote applies to a worrying fraction of my posts...

Cheers,
mwh

--
Or here's an even simpler indicator of how much C++ sucks: Print out the C++ Public Review Document. Have someone hold it about three feet above your head and then drop it. Thus you will be enlightened. -- Thant Tessman