Special-casing the snot out of "O" looks like a winner <wink>:
I have a patch on SF that takes this approach:

http://sourceforge.net/tracker/index.php?func=detail&aid=427190&group_id=5470&atid=305470

The idea is that functions can be declared as METH_O, instead of METH_VARARGS. I also offer METH_l, but this is currently not used. The approach could be extended to other signatures, e.g. METH_O_opt_O (i.e. "O|O"). Some signatures cannot be changed into special-calls, e.g. "O!", or "ll|l".

In the PyXML test suite, "O" is indeed the most frequent case (72%), and it is primarily triggered through len (26%), append (24%), and ord (6%). These are the only functions that make use of the new calling conventions at the moment. If you look at the patch, you'll see that it is quite easy to change a method to use a different calling convention (basically just remove the PyArg_ParseTuple call).

To measure the patch, I use the script

    from time import clock

    indices = [1] * 20000
    indices1 = indices*100
    r1 = [1]*60

    def doit(case):
        s = clock()
        i = 0
        if case == 0:
            f = ord
            for i in indices1:
                f("o")
        elif case == 1:
            for i in indices:
                l = []
                f = l.append
                for i in r1:
                    f(i)
        elif case == 2:
            f = len
            for i in indices1:
                f("o")
        f = clock()
        return f - s

    for i in xrange(10):
        print "%.3f %.3f %.3f" % (doit(0), doit(1), doit(2))

Without the patch, (almost) stock CVS gives

    2.190 1.800 2.240
    2.200 1.800 2.220
    2.200 1.800 2.230
    2.220 1.800 2.220
    2.200 1.800 2.220
    2.200 1.790 2.240
    2.200 1.790 2.230
    2.200 1.800 2.220
    2.200 1.800 2.240
    2.200 1.790 2.230

With the patch, I get

    1.440 1.330 1.460
    1.420 1.350 1.440
    1.430 1.340 1.430
    1.510 1.350 1.460
    1.440 1.360 1.470
    1.460 1.330 1.450
    1.430 1.330 1.420
    1.440 1.340 1.440
    1.430 1.340 1.430
    1.410 1.340 1.450

So the speed-up is roughly 30% to 50%, depending on how much work the function has to do.

Please let me know what you think.

Regards,
Martin
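[Editorial note: the script above is Python 2 (`clock`, `xrange`, the print statement). For readers who want to try the measurement today, a rough, hypothetical Python 3 modernization might look like this; the patched-vs-unpatched comparison itself of course requires building the respective interpreters.]

```python
# Hypothetical Python 3 port of Martin's micro-benchmark:
# perf_counter replaces clock, range replaces xrange.
from time import perf_counter

indices = [1] * 20000
indices1 = indices * 100
r1 = [1] * 60

def doit(case):
    s = perf_counter()
    if case == 0:
        f = ord                      # "O" signature
        for i in indices1:
            f("o")
    elif case == 1:
        for i in indices:
            l = []
            f = l.append             # "O" signature
            for i in r1:
                f(i)
    elif case == 2:
        f = len                      # "O" signature
        for i in indices1:
            f("o")
    return perf_counter() - s

for _ in range(3):
    print("%.3f %.3f %.3f" % (doit(0), doit(1), doit(2)))
```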
"Martin v. Loewis" wrote:
Special-casing the snot out of "O" looks like a winner <wink>:
I have a patch on SF that takes this approach:
http://sourceforge.net/tracker/index.php?func=detail&aid=427190&group_id=5470&atid=305470
The idea is that functions can be declared as METH_O, instead of METH_VARARGS. I also offer METH_l, but this is currently not used. The approach could be extended to other signatures, e.g. METH_O_opt_O (i.e. "O|O"). Some signatures cannot be changed into special-calls, e.g. "O!", or "ll|l".
[benchmark] So the speed-up is roughly 30% to 50%, depending on how much work the function has to do.
Please let me know what you think.
Great idea, Martin. One suggestion though: the one thing I would change is the way the function is "declared" in the method list. You currently use:

    {"append", (PyCFunction)listappend, METH_O, append_doc},

Now this would be more flexible if you would implement a scheme which lets us put the parser string into the method list. The call mechanism could then easily figure out how to call the method and it would also be more easily extensible:

    {"append", (PyCFunction)listappend, METH_DIRECT, append_doc, "O"},

This would then (just like in your patch) call the listappend function with the parser arguments inlined into the C call:

    listappend(self, arg0)

A parser marker "OO" would then call a method like this:

    method(self, arg0, arg1)

and so on.

This approach costs a little more (the string compare), but should provide a more direct way of converting existing functions to the new convention (just copy&paste the PyArg_ParseTuple() argument) and also allows implementing a generic scheme which then again relies on PyArg_ParseTuple() to do the argument parsing, e.g. "is#" could be implemented as:

    PyObject *method(PyObject *self, int arg0, char *arg1, int *arg1_len)

For optional arguments we'd need some convention which then lets the called function add the default value as needed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                       http://www.lemburg.com/python/
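[Editorial note: MAL's METH_DIRECT idea can be sketched in Python terms: the method table carries the parser string, and the dispatcher inspects it to decide how to inline the arguments into the call. The names `METH_DIRECT` and the table layout come from the proposal above; the dispatcher itself is a hypothetical illustration, not CPython code.]

```python
# Hypothetical sketch of METH_DIRECT: the parser string stored in the
# method table tells the dispatcher how many object arguments to
# inline into the C-level call.
def listappend(self, arg0):          # an "O" method
    self.append(arg0)

def string_replace(self, old, new):  # an "OO" method
    return self.replace(old, new)

METHOD_TABLE = {
    # name: (implementation, parser string)
    "append":  (listappend, "O"),
    "replace": (string_replace, "OO"),
}

def call_direct(name, self, args):
    func, fmt = METHOD_TABLE[name]
    nexpected = len(fmt)             # only plain "O..." strings handled here
    if len(args) != nexpected:
        raise TypeError("%s() takes exactly %d argument(s) (%d given)"
                        % (name, nexpected, len(args)))
    return func(self, *args)         # "OO" -> func(self, arg0, arg1)
```

The string compare happens once per call in this sketch; the point of the proposal is that the same table entry could also drive a generic PyArg_ParseTuple()-based fallback for formats that have no special case.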
Now this would be more flexible if you would implement a scheme which lets us put the parser string into the method list. The call mechanism could then easily figure out how to call the method and it would also be more easily extensible:
{"append", (PyCFunction)listappend, METH_DIRECT, append_doc, "O"},
I'd like to hear other people's comment on this specific issue, so I guess I should probably write a PEP outlining the options.

My immediate reaction to your proposal is that it only complicates the interface without any savings. We still can only support a limited number of calling conventions. E.g. it is not possible to write portable C code that does all the calling conventions for "l", "ll", "lll", "llll", and so on - you have to cast the function pointer to the right prototype, which must be done in source code. So with this interface, you may end up at run-time finding out that you cannot support the signature.

With the current patch, you'd have to know to convert "OO" into METH_OO, which I think is not asked too much - and it gives you a compile-time error if you use an unsupported calling convention.
A parser marker "OO" would then call a method like this:
method(self, arg0, arg1)
and so on.
That is indeed the plan, but since you have to code the parameter combinations in C code, you can only support so many of them.
allows implementing a generic scheme which then again relies on PyArg_ParseTuple() to do the argument parsing, e.g. "is#" could be implemented as:
The point of the patch is to get rid of PyArg_ParseTuple in the "common case". For functions with complex calling conventions, getting rid of the PyArg_ParseTuple string parsing is not that important, since they are expensive, anyway (not that "is#" couldn't be supported, I'd call it METH_is_hash).
For optional arguments we'd need some convention which then lets the called function add the default value as needed.
For the moment, I'd only support "|O", and perhaps "|z"; an omitted argument would be represented as a NULL pointer. That means that "|i" couldn't participate in the fast calling convention - unless we translate that to

    void foo(PyObject *self, int i, bool ipresent);

BTW, the most frequent function in my measurements that would make use of this convention is "OO|i:replace", which scores at 4.5%.

Regards,
Martin
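[Editorial note: the "|O" convention Martin describes — an omitted argument arriving as a NULL pointer — can be mimicked in Python with None standing in for NULL. A hypothetical sketch, not the patch itself:]

```python
# Sketch of METH_O_opt_O ("O|O"): one required and one optional object
# argument; the dispatcher always fills both slots, passing None where
# the C convention would pass NULL for an omitted argument.
def iter_like(obj, sentinel):
    if sentinel is None:             # one-argument form
        return ("one-arg", obj)
    return ("two-arg", obj, sentinel)

def call_O_opt_O(func, args):
    if not 1 <= len(args) <= 2:
        raise TypeError("expected 1 or 2 arguments, got %d" % len(args))
    arg1 = args[1] if len(args) == 2 else None   # NULL in the C version
    return func(args[0], arg1)
```

This is also why "|i" is awkward: a C int has no natural "missing" value, hence the extra `ipresent` flag in Martin's translation above.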
"Martin v. Loewis" wrote:
Now this would be more flexible if you would implement a scheme which lets us put the parser string into the method list. The call mechanism could then easily figure out how to call the method and it would also be more easily extensible:
{"append", (PyCFunction)listappend, METH_DIRECT, append_doc, "O"},
I'd like to hear other people's comment on this specific issue, so I guess I should probably write a PEP outlining the options.
My immediate reaction to your proposal is that it only complicates the interface without any savings. We still can only support a limited number of calling conventions. E.g. it is not possible to write portable C code that does all the calling conventions for "l", "ll", "lll", "llll", and so on - you have to cast the function pointer to the right prototype, which must be done in source code.
So with this interface, you may end up at run-time finding out that you cannot support the signature. With the current patch, you'd have to know to convert "OO" into METH_OO, which I think is not asked too much - and it gives you a compile-time error if you use an unsupported calling convention.
True. It's unfortunate that C doesn't offer the reverse of varargs.h...
A parser marker "OO" would then call a method like this:
method(self, arg0, arg1)
and so on.
That is indeed the plan, but since you have to code the parameter combinations in C code, you can only support so many of them.
allows implementing a generic scheme which then again relies on PyArg_ParseTuple() to do the argument parsing, e.g. "is#" could be implemented as:
The point of the patch is to get rid of PyArg_ParseTuple in the "common case". For functions with complex calling conventions, getting rid of the PyArg_ParseTuple string parsing is not that important, since they are expensive, anyway (not that "is#" couldn't be supported, I'd call it METH_is_hash).
For optional arguments we'd need some convention which then lets the called function add the default value as needed.
For the moment, I'd only support "|O", and perhaps "|z"; an omitted argument would be represented as a NULL pointer. That means that "|i" couldn't participate in the fast calling convention - unless we translate that to
void foo(PyObject*self, int i, bool ipresent);
BTW, the most frequent function in my measurements that would make use of this convention is "OO|i:replace", which scores at 4.5%.
I was thinking of using pointer indirection for this:

    foo(PyObject *self, int *i)

If i is given as argument, *i is set to the value, otherwise i is set to NULL.
"Martin v. Loewis" wrote:
I was thinking of using pointer indirection for this:
foo(PyObject *self, int *i)
If i is given as argument, *i is set to the value, otherwise i is set to NULL.
That is a good idea; I'll try to update my patch to more calling conventions.
This morning another idea popped up which could help us with handling generic calling schemes: How about making *all* parameters pointers ?!

The calling mechanism would then just have to deal with a changing number of parameters and not with different types (this is how PyArg_ParseTuple() works too if I remember correctly). We could easily provide calling schemes for 1 - n arguments that way and the types of these arguments would be defined by the parser string just like before.

Examples:

    foo(PyObject *self, PyObject *obj, int *i)
    bar(PyObject *self, int *i, int *j, char *txt, int *len)

To call these, the calling mechanism would have to cast these to:

    foo(void *, void *, void *)
    bar(void *, void *, void *, void *, void *)

Wouldn't this work ?
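[Editorial note: in Python terms, the all-pointers scheme amounts to the dispatcher converting each argument according to the parser string and then making one uniform call, since every converted value has the same "shape" (a pointer in C, an object reference here). A hypothetical sketch:]

```python
# Sketch of the "all parameters are pointers" idea: each format code
# maps to a converter, and the call site is uniform regardless of the
# argument types -- in C, this uniformity is what would permit casting
# every implementation to foo(void *, void *, ...).
CONVERTERS = {
    "O": lambda x: x,    # any object, unchanged
    "i": int,            # C int
    "s": str,            # C char *
}

def call_with_format(func, fmt, args):
    if len(args) != len(fmt):
        raise TypeError("expected %d arguments, got %d"
                        % (len(fmt), len(args)))
    converted = [CONVERTERS[code](a) for code, a in zip(fmt, args)]
    return func(*converted)          # one slot per format code

def bar(i, j, txt):                  # stands in for an "iis" method
    return "%s: %d" % (txt, i + j)
```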
To call these, the calling mechanism would have to cast these to:
foo(void *, void *, void *)
bar(void *, void *, void *, void *, void *)
Wouldn't this work ?
I think it would work, but I doubt it would save much compared to the existing approach. The main point of this patch is to improve efficiency, and (according to Jeremy's analysis), most of the time for calling a function is spent in PyArg_ParseTuple. So if we replace it with another interface that also relies on parsing a string, I doubt we'll improve efficiency.

IOW, I won't implement that approach. If you do, I'd be curious to hear the results, of course.

Regards,
Martin

P.S. There would be still cases where PyArg_ParseTuple is needed, e.g. for "O!".
"Martin v. Loewis" wrote:
To call these, the calling mechanism would have to cast these to:
foo(void *, void *, void *)
bar(void *, void *, void *, void *, void *)
Wouldn't this work ?
I think it would work, but I doubt it would save much compared to the existing approach. The main point of this patch is to improve efficiency, and (according to Jeremy's analysis), most of the time for calling a function is spent in PyArg_ParseTuple. So if we replace it with another interface that also relies on parsing a string, I doubt we'll improve efficiency.
That's the point: we are not replacing PyArg_ParseTuple() with another parsing mechanism, we are only using PyArg_ParseTuple() as fallback solution for parser strings for which we don't provide a special case implementation.

The idea is to simply do a strcmp() for a few common combinations (like e.g. "O" and "OO") and then provide the same special case handling like you do with e.g. METH_O. The result would be almost the same w/r to performance and code reduction as with your approach. The only addition would be using strcmp() instead of a switch statement.

The advantage of this approach is that while you can still provide special case handling of common parser strings, you can also provide generic APIs for most other parser strings by reverting to PyArg_ParseTuple() for these.
IOW, I won't implement that approach. If you do, I'd be curious to hear the results, of course.
I'll see what I can do...
P.S. There would be still cases where PyArg_ParseTuple is needed, e.g. for "O!".
True... can't win 'em all ;-)
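[Editorial note: MAL's hybrid — special-case a few common parser strings, fall back to the generic parser otherwise — might look like this in outline. A hypothetical Python sketch; the real comparison would be a C strcmp() against the string in the method table, and the fallback would be PyArg_ParseTuple() itself.]

```python
# Sketch of the strcmp() fast path: "O" and "OO" bypass format parsing
# entirely; anything else falls through to a generic parser.
def generic_parse(fmt, args):
    # Simplified stand-in for PyArg_ParseTuple: every format code is
    # treated as "O" here, just to have a working fallback.
    if len(args) != len(fmt):
        raise TypeError("expected %d arguments" % len(fmt))
    return list(args)

def dispatch(func, fmt, args):
    if fmt == "O":                   # fast path, no parsing at all
        if len(args) != 1:
            raise TypeError("expected 1 argument")
        return func(args[0])
    if fmt == "OO":                  # fast path, no parsing at all
        if len(args) != 2:
            raise TypeError("expected 2 arguments")
        return func(args[0], args[1])
    return func(*generic_parse(fmt, args))   # slow generic fallback
```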
"MvL" == Martin v Loewis <martin@loewis.home.cs.tu-berlin.de> writes:
MvL> to the existing approach.  The main point of this patch is to
MvL> improve efficiency, and (according to Jeremy's analysis), most
MvL> of the time for calling a function is spent in
MvL> PyArg_ParseTuple.

I'd like to qualify this a bit. What I reported earlier is that the BuiltinFunctionCall microbenchmark in pybench spends 30% of its time in PyArg_ParseTuple(). This strikes me as excessive, because it's a static property of the code. (One could imagine writing a Python script that parsed the "O!|is#" format strings and generated efficient, specialized C code for that format.)

If we benchmark other programs, particularly those that do more work in the builtins, the relative cost of the argument processing will be lower.

Jeremy
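[Editorial note: Jeremy's parenthetical — processing the format string once rather than on every call — can be approximated in Python by compiling each format into a closure at table-build time. A hypothetical sketch that only handles single-character object codes, not the full "O!|is#" grammar:]

```python
# Sketch of "parse the format string once, not per call": specialize()
# inspects the format at registration time and returns a checker that
# does only integer comparisons at call time.
def specialize(fmt):
    required = len(fmt.split("|")[0])      # codes before "|" are required
    maximum = len(fmt.replace("|", ""))    # all codes, optional included
    def check(args):
        if not required <= len(args) <= maximum:
            raise TypeError("expected %d to %d arguments, got %d"
                            % (required, maximum, len(args)))
        return args
    return check

check_getattr = specialize("OO|O")   # built once, when the table is built

def fast_getattr(args):
    args = check_getattr(args)       # no format-string scan per call
    if len(args) == 2:
        return getattr(*args)
    obj, name, dflt = args
    return getattr(obj, name, dflt)
```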
I'd like to qualify this a bit. What I reported earlier is that the BuiltinFunctionCall microbenchmark in pybench spends 30% of its time in PyArg_ParseTuple(). This strikes me as excessive, because it's a static property of the code. (One could imagine writing a Python script that parsed the "O!|is#" format strings and generated efficient, specialized C code for that format.)
If we benchmark other programs, particularly those that do more work in the builtins, the relative cost of the argument processing will be lower.
Certainly: If the work inside the function increases, the overhead of calling it will be less visible. What the benchmark shows, however, and what my patch addresses, is that the time for *calling* a function is primarily spent in PyArg_ParseTuple (and not in, say, building argument tuples, putting parameters on the stack, fetching function addresses, building method objects, and so on). Regards, Martin
I don't want to see us duplicate the guts of PyArg_ParseTuple() inside do_call_special(). METH_O is a cool idea, METH_l is marginal, and the new code is already slower for METH_O than it needs to be in order to support the *possibility* of METH_l too (stacks and loops and switch stmts and an extra layer of do_call_special function call "just in case").

Do METH_O, convert every "O" function to use it, declare victory, and enjoy the weekend <wink>.

1%-of-the-work-for-80%-of-the-gain-and-an-overall-decrease-in-code-
size-ly y'rs  - tim
"TP" == Tim Peters <tim.one@home.com> writes:
TP> Do METH_O, convert every "O" function to use it, declare
TP> victory, and enjoy the weekend <wink>.

TP> 1%-of-the-work-for-80%-of-the-gain-and-an-overall-decrease-in-code-
TP> size-ly y'rs  - tim

How is METH_O different than METH_OLDARGS? The old-style argument passing is definitely the most efficient for functions of zero or one arguments. There's special-case code in ceval to support these cases -- fast_cfunction() -- primarily because in these cases the function can be invoked by using arguments directly from the Python stack instead of copying them to a tuple first.

Jeremy
[Jeremy]
How is METH_O different than METH_OLDARGS?
I have no idea: can you explain it? The #define's for these symbols are uncommented, and it's a mystery to me what they're *supposed* to mean.
The old-style argument passing is definitely the most efficient for functions of zero or one arguments. There's special-case code in ceval to support these cases -- fast_cfunction() -- primarily because in these cases the function can be invoked by using arguments directly from the Python stack instead of copying them to a tuple first.
OK, I'm looking in bltinmodule.c, at builtin_len. It starts like so:

    static PyObject *
    builtin_len(PyObject *self, PyObject *args)
    {
        PyObject *v;
        long res;

        if (!PyArg_ParseTuple(args, "O:len", &v))
            return NULL;

So it's clearly expecting a tuple. But its entry in the builtin_methods[] table is:

    {"len", builtin_len, 1, len_doc},

That is, it says nothing about the calling convention. Since C fills in a 0 for missing values, and methodobject.c has

    /* Flag passed to newmethodobject */
    #define METH_OLDARGS  0x0000
    #define METH_VARARGS  0x0001
    #define METH_KEYWORDS 0x0002

then doesn't the struct for builtin_len implicitly specify METH_OLDARGS? But if that's true, and fast_cfunction() does not create a tuple in this case, how is it that builtin_len gets a tuple? Something doesn't add up here. Or does it? There's no *reference* to METH_OLDARGS anywhere in the code base other than its definition and its use in method tables, so whatever code *keys* off it must be assuming a hardcoded 0 value for it -- or indeed nothing keys off it at all. I expect this line in ceval.c is doing the dirty assumption:

    } else if (flags == 0) {

and should be testing against METH_OLDARGS instead. But I see that builtin_len is falling into the METH_VARARGS case despite that it wasn't declared that way and that it sure looks like METH_OLDARGS (0) is the default. Confusing! Fix it <wink>.
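[Editorial note: the dispatch Tim is reverse-engineering can be modeled like this — a simplified, hypothetical Python sketch of the old-args-vs-varargs calling conventions, not the real ceval code: flags == 0 (METH_OLDARGS) passes zero arguments as NULL, one argument bare, and several as a tuple, while METH_VARARGS always builds a tuple.]

```python
# Simplified model of the two calling conventions under discussion.
METH_OLDARGS = 0x0000
METH_VARARGS = 0x0001

def call_cfunction(func, flags, stack_args):
    if flags == METH_VARARGS:
        return func(tuple(stack_args))     # always a fresh tuple
    if flags == METH_OLDARGS:
        n = len(stack_args)
        if n == 0:
            return func(None)              # NULL in C
        if n == 1:
            return func(stack_args[0])     # bare object, no tuple built
        return func(tuple(stack_args))     # tuple again for n > 1
    raise SystemError("bad call flags")
```

This also shows why a METH_OLDARGS function cannot tell one tuple argument from several arguments: both arrive as a tuple.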
[Tim, thrashing]
... So it's clearly expecting a tuple. But its entry in the builtin_methods[] table is:
{"len", builtin_len, 1, len_doc},
That is, it says nothing about the calling convention.
Oops, it does, using a hardcoded 1 instead of the METH_VARARGS #define. So that explains that. Next question: why isn't builtin_len using METH_OLDARGS instead? Is there some advantage to using METH_VARARGS in this case? This gets back to what these #defines are intended to *mean*, and I still haven't figured that out.
On Sun, 27 May 2001, Tim Peters wrote:
Next question: why isn't builtin_len using METH_OLDARGS instead? Is there some advantage to using METH_VARARGS in this case?
So you can't do
>>> len(1,2)
2
a la list.append, socket.connect pre 2.0? (or was it 1.6?)

My impression is that generally METH_VARARGS is saner than METH_OLDARGS (ie. more consistent). It seems the proposed METH_O is basically METH_OLDARGS + the restriction that there is in fact only one argument, so we save a tuple allocation over METH_VARARGS, but get argument count checking over METH_OLDARGS.

Cheers,
M.
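[Editorial note: Michael's summary — METH_O is old-style directness plus an argument-count check — can be stated as a tiny model. A hypothetical sketch, using error messages in the style discussed later in the thread:]

```python
# METH_O in miniature: exactly one positional argument, passed bare
# (no tuple allocation, no PyArg_ParseTuple), with the count checked.
def call_meth_o(name, func, args, kwargs=None):
    if kwargs:
        raise TypeError("%s() takes no keyword arguments" % name)
    if len(args) != 1:
        raise TypeError("%s() takes exactly 1 argument (%d given)"
                        % (name, len(args)))
    return func(args[0])
```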
[Tim]
Next question: why isn't builtin_len using METH_OLDARGS instead? Is there some advantage to using METH_VARARGS in this case?
[Michael Hudson]
So you can't do
>>> len(1,2)
2
a la list.append, socket.connect pre 2.0? (or was it 1.6?)
If I didn't know better, I'd suspect Python's internal calling conventions at the start didn't perfectly anticipate all future developements. Among other things, looks like it's impossible for a METH_OLDARGS function to distinguish between being called with more than one argument and being called with a single tuple argument.
My impression is that generally METH_VARARGS is saner than METH_OLDARGS (ie. more consistent).
Yes, METH_OLDARGS does appear to, well, suck.
It seems the proposed METH_O is basically METH_OLDARGS + the restriction that there is in fact only one argument, so we save a tuple allocation over METH_VARARGS,
Also, and more importantly, save the PyArg_ParseTuple call on the receiving end.
but get argument count checking over METH_OLDARGS.
Which is worth getting. I'm back to where I started here: Do METH_O, convert every "O" function to use it, declare victory, and enjoy the weekend.

1%-of-the-work-for-80%-of-the-gain-and-an-overall-decrease-in-code-
size-ly y'rs  - tim

PS: But today I'll add another: add at least one comment to the code -- this stuff is a bitch to reverse-engineer.
On Sun, May 27, 2001 at 06:49:38PM -0400, Tim Peters wrote:
1%-of-the-work-for-80%-of-the-gain-and-an-overall-decrease-in-code- size-ly y'rs - tim
And recycle a quote a day ;)
PS: But today I'll add another: add at least one comment to the code -- this stuff is a bitch to reverse-engineer.
But not just any comment, please! The Pine sourcecode is riddled with calls to 'mm_critical(stream)', and each call I've seen so far is nicely commented with the utterly useless comment '/* go critical */'.

I'd-gladly-trade-in-every-mm_critical-comment-for-one-comment-to-describe-
-what-Pine-actually-tries-to-do-ly y'rs,

-- 
Thomas Wouters <thomas@xs4all.net>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
On Sun, May 27, 2001 at 10:32:48PM +0100, Michael Hudson wrote:
On Sun, 27 May 2001, Tim Peters wrote:
Next question: why isn't builtin_len using METH_OLDARGS instead? Is there some advantage to using METH_VARARGS in this case?
So you can't do
>>> len(1,2)
2
a la list.append, socket.connect pre 2.0? (or was it 1.6?)
And don't forget the method-specific error message by passing ':len' in the format string. Of course, this can easily be (and probably should) done by passing another argument to whatever parses arguments in METH_O, rather than invoking string parsing magic every call.
[Thomas Wouters]
And don't forget the method-specific error message by passing ':len' in the format string. Of course, this can easily be (and probably should) done by passing another argument to whatever parses arguments in METH_O, rather than invoking string parsing magic every call.
Martin's patch automatically inserts the name of the function in the TypeError it raises when a METH_O call doesn't get exactly one argument, or gets a (one or more) keyword argument.

Stick to METH_O and it's a clear win, even in this respect: there's no info in an explicit ":len" he's not already deducing, and almost all instances of "O:name" formats today are exactly the same this way:

    if (!PyArg_ParseTuple(args, "O:abs", &v))
    if (!PyArg_ParseTuple(args, "O:callable", &v))
    if (!PyArg_ParseTuple(args, "O:id", &v))
    if (!PyArg_ParseTuple(args, "O:hash", &v))
    if (!PyArg_ParseTuple(args, "O:hex", &v))
    if (!PyArg_ParseTuple(args, "O:float", &v))
    if (!PyArg_ParseTuple(args, "O:len", &v))
    if (!PyArg_ParseTuple(args, "O:list", &v))
    else if (!PyArg_ParseTuple(args, "O:min/max", &v))
    if (!PyArg_ParseTuple(args, "O:oct", &v))
    if (!PyArg_ParseTuple(args, "O:ord", &obj))
    if (!PyArg_ParseTuple(args, "O:reload", &v))
    if (!PyArg_ParseTuple(args, "O:repr", &v))
    if (!PyArg_ParseTuple(args, "O:str", &v))
    if (!PyArg_ParseTuple(args, "O:tuple", &v))
    if (!PyArg_ParseTuple(args, "O:type", &v))

Those are all the ones in bltinmodule.c, and nearly all of them are called extremely frequently in *some* programs. The only oddball is min/max, but then it supports more than one call-list format and so isn't a METH_O candidate anyway. Indeed, Martin's patch gives a *better* message than we get for some mistakes today:
>>> len(val=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() takes exactly 1 argument (0 given)
Martin's would say

    TypeError: len takes no keyword arguments

in this case. He should add "()" after the function name. He should also throw away the half of the patch complicating and slowing METH_O to get some theoretical speedup in other cases: make the one-arg builtins fly just as fast as humanly possible.
Tim Peters wrote:
[Thomas Wouters]
And don't forget the method-specific errormessage by passing ':len' in the format string. Of course, this can easily be (and probably should) done by passing another argument to whatever parses arguments in METH_O, rather than invoking string parsing magic every call.
Martin's patch automatically inserts the name of the function in the TypeError it raises when a METH_O call doesn't get exactly one argument, or gets a (one or more) keyword argument.
Stick to METH_O and it's a clear win, even in this respect: there's no info in an explicit ":len" he's not already deducing, and almost all instances of "O:name" formats today are exactly the same this way:
if (!PyArg_ParseTuple(args, "O:abs", &v))
if (!PyArg_ParseTuple(args, "O:callable", &v))
if (!PyArg_ParseTuple(args, "O:id", &v))
if (!PyArg_ParseTuple(args, "O:hash", &v))
if (!PyArg_ParseTuple(args, "O:hex", &v))
if (!PyArg_ParseTuple(args, "O:float", &v))
if (!PyArg_ParseTuple(args, "O:len", &v))
if (!PyArg_ParseTuple(args, "O:list", &v))
else if (!PyArg_ParseTuple(args, "O:min/max", &v))
if (!PyArg_ParseTuple(args, "O:oct", &v))
if (!PyArg_ParseTuple(args, "O:ord", &obj))
if (!PyArg_ParseTuple(args, "O:reload", &v))
if (!PyArg_ParseTuple(args, "O:repr", &v))
if (!PyArg_ParseTuple(args, "O:str", &v))
if (!PyArg_ParseTuple(args, "O:tuple", &v))
if (!PyArg_ParseTuple(args, "O:type", &v))
Those are all the ones in bltinmodule.c, and nearly all of them are called extremely frequently in *some* programs. The only oddball is min/max, but then it supports more than one call-list format and so isn't a METH_O candidate anyway. Indeed, Martin's patch gives a *better* message than we get for some mistakes today:
>>> len(val=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() takes exactly 1 argument (0 given)
Martin's would say
TypeError: len takes no keyword arguments
in this case. He should add "()" after the function name. He should also throw away the half of the patch complicating and slowing METH_O to get some theoretical speedup in other cases: make the one-arg builtins fly just as fast as humanly possible.
If we end up only optimizing the re.match("O+") case, we wouldn't need the METH_SPECIAL masks; a simple METH_OBJARGS flag would do the trick and Martin could call the underlying API with one or more PyObject* taken directly from the Python VM stack.

In that case, please consider at least supporting "O", "OO" and "OOO" with optional arguments treated like I suggested in an earlier posting (simply pass NULL and let the API take care of assigning a default value). This would take care of most builtins:

Python/bltinmodule.c:
    if (!PyArg_ParseTuple(args, "OO:filter", &func, &seq))
    if (!PyArg_ParseTuple(args, "OO:cmp", &a, &b))
    if (!PyArg_ParseTuple(args, "OO:coerce", &v, &w))
    if (!PyArg_ParseTuple(args, "OO:divmod", &v, &w))
    if (!PyArg_ParseTuple(args, "OO|O:getattr", &v, &name, &dflt))
    if (!PyArg_ParseTuple(args, "OO:hasattr", &v, &name))
    if (!PyArg_ParseTuple(args, "OOO:setattr", &v, &name, &value))
    if (!PyArg_ParseTuple(args, "OO:delattr", &v, &name))
    if (!PyArg_ParseTuple(args, "OO|O:pow", &v, &w, &z))
    if (!PyArg_ParseTuple(args, "OO|O:reduce", &func, &seq, &result))
    if (!PyArg_ParseTuple(args, "OO:isinstance", &inst, &cls))
    if (!PyArg_ParseTuple(args, "OO:issubclass", &derived, &cls))
    if (!PyArg_ParseTuple(args, "O:abs", &v))
    if (!PyArg_ParseTuple(args, "O|OO:apply", &func, &alist, &kwdict))
    if (!PyArg_ParseTuple(args, "O:callable", &v))
    if (!PyArg_ParseTuple(args, "O|O:complex", &r, &i))
    if (!PyArg_ParseTuple(args, "O:id", &v))
    if (!PyArg_ParseTuple(args, "O:hash", &v))
    if (!PyArg_ParseTuple(args, "O:hex", &v))
    if (!PyArg_ParseTuple(args, "O:float", &v))
    if (!PyArg_ParseTuple(args, "O|O:iter", &v, &w))
    if (!PyArg_ParseTuple(args, "O:len", &v))
    if (!PyArg_ParseTuple(args, "O:list", &v))
    if (!PyArg_ParseTuple(args, "O|OO:slice", &start, &stop, &step))
    else if (!PyArg_ParseTuple(args, "O:min/max", &v))
    if (!PyArg_ParseTuple(args, "O:oct", &v))
    if (!PyArg_ParseTuple(args, "O:ord", &obj))
    if (!PyArg_ParseTuple(args, "O:reload", &v))
    if (!PyArg_ParseTuple(args, "O:repr", &v))
    if (!PyArg_ParseTuple(args, "O:str", &v))
    if (!PyArg_ParseTuple(args, "O:tuple", &v))
    if (!PyArg_ParseTuple(args, "O:type", &v))
[MAL]
If we end up only optimizing the re.match("O+") case, we wouldn't need the METH_SPECIAL masks; a simple METH_OBJARGS flag would do the trick and Martin could call the underlying API with one or more PyObject* taken directly from the Python VM stack.
How then does the callee know it was called with the correct # of arguments? By adding enough pointer arguments to cover the longest possible O+ string plus 1, then verifying that the one just beyond the last one it expects is NULL, while the ones before that are not? Adding another "# of arguments" member to the method table? Inventing METH_O, METH_OO, METH_OOO, ...?
In that case, please consider at least supporting "O", "OO" and "OOO" with optional arguments treated like I suggested in an earlier posting (simply pass NULL and let the API take care of assigning a default value).
This would take care of most builtins:
You don't have to convince me that cases other than plain "O" exist. What's missing is data in support of the idea that calls to those are relatively frequent enough that it's a NET win to slow plain "O" in order to speed the additional cases when they happen.

For example, it's not possible for calls to reduce() to have a high hit rate in real life, because builtin_reduce is a very expensive function -- there's only so many of those you can cram into a second even if the calling overhead is 0. OTOH, add a single branch to the time it takes to find builtin_type and you've slowed its *total* execution time significantly.

The implementation of METH_O alone is a pure win by any measure. So would be implementing METH_OO alone, or METH_OOO alone, etc. Mix them, and they all get slower than they could have been. All the data we have says METH_O is the single most important case, and that jibes with common sense, so I believe it.

If you want to speed everything, fine, do that, but that likely requires a preprocessing phase so that type signatures don't have to be resolved at runtime at all. So long as we're just looking at simple hacks, "the simpler the better" is good advice and should rule in the absence of compelling evidence against it.
participants (6)
- Jeremy Hylton
- M.-A. Lemburg
- Martin v. Loewis
- Michael Hudson
- Thomas Wouters
- Tim Peters