2010/10/26 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
> 2010/10/26 M.-A. Lemburg <mal@egenix.com>
> I was referring to the solution (which I prefer) that I proposed in my
> reply to Greg two days ago.
> Unfortunately, the stack must be used whichever solution we choose.
> Pushing the "final" tuple and/or dictionary is a possible optimization, but
> we can use it only when we have a tuple or dict of constants; otherwise we
> need to use the stack.
> Good case:  f(1, 2, 3, a = 1, b = 2)
> We can push the (1, 2, 3) tuple and the {'a' : 1, 'b' : 2} dict, then call
> f with the CALL_FUNCTION_VAR_KW opcode, passing narg = nkarg = 0.
> Worst case: f(1, x, 3, a = x, b = 2)
> We can't push the tuple and dict as a whole, because they first need to be
> built using the stack.
> The good case is possible, and I have already done some work in wpython
> collecting constants when parameters are pushed (even partial constant
> sequences), but some additional work is needed to recognize constants-only
> tuples / dicts.
> However, the worst case remains unresolved.

I don't understand. What is the difference between pushing values
on the stack and building a tuple/dict and then pushing those on
the stack ?

In your worst case example, the compiler would first build
a tuple/dict using the args already on the stack (BUILD_TUPLE,
BUILD_MAP) and then call the function with this tuple/dict
combination - you'd basically move the tuple/dict building to
the compiler rather than having the CALL* opcodes do this.

It would essentially run:

f(*(1,x,3), **{'a':x, 'b':2})

and bypass the "max. number of opcode args" limit without
degrading performance, since BUILD_TUPLE et al. essentially
run the same code for building the call arguments as the
helpers for calling a function.
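
As a quick illustration of the equivalence above, here is a minimal sketch
at the Python level. The callee f and the value of x are made up for the
example; it only shows that the two call forms deliver the same arguments,
and that the packed form is not tied to the number of values passed.

```python
def f(*args, **kwargs):
    """Hypothetical callee that simply reports what it received."""
    return args, kwargs


x = 42  # arbitrary non-constant value, chosen for the example

# The "worst case" call, written directly and in packed form.
direct = f(1, x, 3, a=x, b=2)
packed = f(*(1, x, 3), **{'a': x, 'b': 2})
assert direct == packed

# The packed form takes one tuple, however many values it holds.
many = tuple(range(300))
args, kwargs = f(*many)
assert len(args) == 300 and kwargs == {}
```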

Marc-Andre Lemburg

Yes, the idea is to let the compiler emit the code that builds the tuple/dict, instead of having the CALL_* opcodes do it, in order to bypass the current limits.
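
To make the idea concrete, here is a rough sketch of the rewrite done at the source level with the standard ast module. It is only an illustration, not the proposed patch: the real change would live in the bytecode compiler, and the names PackCallArgs and rewrite are hypothetical. It assumes Python 3.8 or later (for ast.Constant) and skips calls that already use * or **.

```python
import ast


class PackCallArgs(ast.NodeTransformer):
    """Hypothetical pass: turn f(a, b, k=v) into f(*(a, b), **{'k': v})."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        # Keep the sketch short: leave calls that already use * or ** alone.
        if any(isinstance(a, ast.Starred) for a in node.args):
            return node
        if any(kw.arg is None for kw in node.keywords):
            return node
        new_args, new_keywords = [], []
        if node.args:
            packed = ast.Tuple(elts=node.args, ctx=ast.Load())
            new_args = [ast.Starred(value=packed, ctx=ast.Load())]
        if node.keywords:
            packed_kw = ast.Dict(
                keys=[ast.Constant(value=kw.arg) for kw in node.keywords],
                values=[kw.value for kw in node.keywords],
            )
            new_keywords = [ast.keyword(arg=None, value=packed_kw)]
        node.args = new_args
        node.keywords = new_keywords
        return node


def rewrite(source):
    """Parse source, pack the arguments of every call, return the new tree."""
    tree = PackCallArgs().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


def f(*args, **kwargs):
    """Hypothetical callee that reports what it received."""
    return args, kwargs


namespace = {"f": f, "x": 7}
tree = rewrite("result = f(1, x, 3, a=x, b=2)")
exec(compile(tree, "<sketch>", "exec"), namespace)
assert namespace["result"] == ((1, 7, 3), {"a": 7, "b": 2})
```

A real implementation would presumably apply this only above the argument limit, so that ordinary calls keep using the existing fast path.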

That applies if we don't want to change the current CALL_* behavior: the common cases stay fast, and the uncommon ones get a slower (but working) path.

Another solution would be to introduce a dedicated opcode, but I don't think that's justified if the only purpose is to permit more than 255 arguments.

At the moment I have no other ideas for solving this problem.

Please let me know if there's interest in a new patch to implement the "compiler-based" solution.