New 3.x restriction on number of keyword arguments
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.

The origin of the change was a hack to fit positional argument counts and keyword-only argument counts in a single oparg in the python opcode encoding.

ISTM, this is an implementation specific hack and there is no reason that other implementations would have the same restriction (unless their starting point is Python's bytecode).

The good news is that long argument lists are uncommon. They probably only arise in cases with dynamically created functions and classes. Most people are unaffected.

The bad news is that an implementation detail has become visible and added a language restriction. The 255 limit seems weird to me in a version of Python that has gone to lengths to unify ints and longs so that char/short/long boundaries stop manifesting themselves to users.

Is there any support here for trying to get smarter about the keyword-only argument implementation? The 255 limit does not seem unreasonably low, but then it was once thought that no one would ever need more than 640k of ram. If the new restriction isn't necessary, it would be great to remove it.

Raymond
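A minimal reproduction of what Raymond describes (hypothetical field names; namedtuple builds the class by generating a def with one parameter per field, which is what hits the limit):

    from collections import namedtuple

    fields = ['f%d' % i for i in range(300)]   # 300 generated column names

    # Fine on Python 2.x; on Python 3.1/3.2 the generated class definition
    # fails with "SyntaxError: more than 255 arguments".
    Row = namedtuple('Row', fields)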
On 17/09/2010 21:00, Raymond Hettinger wrote:
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
The origin of the change was a hack to fit positional argument counts and keyword-only argument counts in a single oparg in the python opcode encoding.
ISTM, this is an implementation specific hack and there is no reason that other implementations would have the same restriction (unless their starting point is Python's bytecode).
The good news is that long argument lists are uncommon. They probably only arise in cases with dynamically created functions and classes. Most people are unaffected.
The bad news is that an implementation detail has become visible and added a language restriction. The 255 limit seems weird to me in a version of Python that has gone to lengths to unify ints and longs so that char/short/long boundaries stop manifesting themselves to users.
Is there any support here for trying to get smarter about the keyword-only argument implementation? The 255 limit does not seem unreasonably low, but then it was once thought that no one would ever need more than 640k of ram. If the new restriction isn't necessary, it would be great to remove it.
Strings can be any length, lists can be any length, even the humble int can be any length! It does seem unPythonic to have a low limit like that. I think that the implementation hack needs a bit of a rethink if that's what it's causing, IMHO.
On 17Sep2010 21:23, MRAB <python@mrabarnett.plus.com> wrote:
| On 17/09/2010 21:00, Raymond Hettinger wrote:
| >One of the use cases for named tuples is to have them be
| >automatically created from a SQL query or CSV header. Sometimes (but
| >not often), those can have a huge number of columns. In Python 2.x,
| >it worked just fine -- we had a test for a named tuple with 5000
| >fields. In Python 3.x, there is a SyntaxError when there are more
| >than 255 fields.
| >
| >The origin of the change was a hack to fit positional argument counts
| >and keyword-only argument counts in a single oparg in the python
| >opcode encoding.
[...]
| >Is there any support here for trying to get smarter about the
| >keyword-only argument implementation?
[...]
|
| Strings can be any length, lists can be any length, even the humble int
| can be any length!
| It does seem unPythonic to have a low limit like that.

A big +10 from me. Implementation internals should not cause language level limitations.

If there's a (entirely reasonable IMHO) desire to get the opcode small, the count should be encoded in a compact but extendable form. (I speak here with no idea how inflexible the opcode readers are.)

As an example, I use a personal encoding for natural numbers scheme where values below 128 fit in one byte, 128 or more set the top bit on leading bytes to indicate follow-on bytes, so values up to 16383 fit in two bytes and so on arbitrarily. Compact and simple but unbounded.

Is something like that tractable for the Python opcodes?

Cheers,
--
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

I am returning this otherwise good typing paper to you because someone has printed gibberish all over it and put your name at the top. - English Professor, Ohio University
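As a sketch (not Cameron's actual code), the scheme he describes is essentially a 7-bits-per-byte variable-length encoding along these lines (Python 3):

    def encode_natural(n):
        """Encode a non-negative int, 7 payload bits per byte; the top bit
        marks "more bytes follow", so values below 128 take one byte and
        values up to 16383 take two."""
        out = []
        while True:
            byte = n & 0x7F
            n >>= 7
            if n:
                out.append(byte | 0x80)   # continuation bit set
            else:
                out.append(byte)          # final byte, top bit clear
                return bytes(out)

    def decode_natural(data):
        n = shift = 0
        for byte in data:
            n |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        return n

    assert decode_natural(encode_natural(300)) == 300   # fits in two bytes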
On Sat, 18 Sep 2010 07:05:46 +1000 Cameron Simpson <cs@zip.com.au> wrote:
As an example, I use a personal encoding for natural numbers scheme where values below 128 fit in one byte, 128 or more set the top bit on leading bytes to indicate follow-on bytes, so values up to 16383 fit in two bytes and so on arbitrarily. Compact and simple but unbounded.
Well, you are proposing that we (Python core maintainers) live with additional complication in one of the most central and critical parts of the interpreter, just so that we satisfy some theoretical impulse for "consistency". That doesn't sound reasonable. (and, sure, the variable-length encoding wouldn't be very complicated; it would still be more complicated than it needs to be, and that's already a problem) For the record, have you been hit by this problem, or do you even think you might be hit by it in the near future? Thank you Antoine.
On 17Sep2010 23:21, Antoine Pitrou <solipsis@pitrou.net> wrote:
| On Sat, 18 Sep 2010 07:05:46 +1000
| Cameron Simpson <cs@zip.com.au> wrote:
| > As an example, I use a personal encoding for natural numbers scheme
| > where values below 128 fit in one byte, 128 or more set the top bit on
| > leading bytes to indicate follow-on bytes, so values up to 16383 fit in
| > two bytes and so on arbitrarily. Compact and simple but unbounded.
|
| Well, you are proposing that we (Python core maintainers) live with
| additional complication in one of the most central and critical parts of
| the interpreter, just so that we satisfy some theoretical impulse for
| "consistency". That doesn't sound reasonable.
[...]
| For the record, have you been hit by this problem, or do you even think
| you might be hit by it in the near future?

Me, no. But arbitrary _syntactic_ constraints in an otherwise flexible language grate.

I was only suggesting a compactness-supporting approach, not lobbying very hard for making the devs use it. I'm +10 on removing the syntactic constraint, not on hacking the opcode definitions.

Cheers,
--
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Withdrawing in disgust is not the same as conceding. - Jon Adams <jadams@sea06f.sea06.navy.mil>
Cameron Simpson wrote:
If there's a (entirely reasonable IMHO) desire to get the opcode small, the count should be encoded in a compact but extendable form.
I suspect it's more because it was easier to do it that way than to track down all the places that assume a bytecode never has more than one 16-bit operand. -- Greg
On Fri, 17 Sep 2010 13:00:08 -0700 Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
I don't understand your explanation. You can't pass a namedtuple using the **kw convention:
>>> import collections
>>> T = collections.namedtuple('a', 'b c d')
>>> t = T(1,2,3)
>>> def f(**a): pass
...
>>> f(**t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() argument after ** must be a mapping, not a
Besides, even if that worked, you are doing an intermediate conversion to a dict, which is wasteful. Why not simply pass the namedtuple as a regular parameter?
The bad news is that an implementation detail has become visible and added a language restriction. The 255 limit seems weird to me in a version of Python that has gone to lengths to unify ints and longs so that char/short/long boundaries stop manifesting themselves to users.
Well, it sounds like a theoretical worry of no practical value to me. The **kw notation is meant to marshal passing of actual keyword args, which are going to be explicitly typed in either at the call site or at the function definition site (ignoring any proxies in-between). Nobody is going to type more than 255 keyword arguments by hand. And there's generated code, but since it's generated they can easily find a workaround anyway.
If the new restriction isn't necessary, it would be great to remove it.
I assume the restriction is useful since, according to your explanation, it improves the encoding of opcodes. Of course, we could switch bytecode to use a standard 32-bit word size, but someone has to propose a patch. Regards Antoine.
On Sat, Sep 18, 2010 at 7:11 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Fri, 17 Sep 2010 13:00:08 -0700 Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
I don't understand your explanation. You can't pass a namedtuple using the **kw convention:
But you do need to *initialise* the named tuple after you create it. If it's a big tuple, then all of those field values need to be passed in either as positional arguments or as keyword arguments. A restriction to 255 parameters means that named tuples with more than 255 fields become a lot less useful. Merging the parameter count into the opcode as an optimisation when the number of parameters is < 256 is fine. *Disallowing* parameter counts > 255 is not. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 9/17/2010 4:00 PM, Raymond Hettinger wrote:
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
So, when the test failed due to the code change, the test was simply removed?
The origin of the change was a hack to fit positional argument counts and keyword-only argument counts in a single oparg in the python opcode encoding.
I do not remember any discussion of adding such a language restriction, though I could have forgotten or missed it. As near as I can tell, it is undocumented. While there are undocumented limits to the interpreter, like nesting depth, this one is so low that I would consider the discrepancy between doc and behavior a bug. -- Terry Jan Reedy
On Sep 17, 2010, at 4:00 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote: ..
Is there any support here for trying to get smarter about the keyword-only argument implementation? The 255 limit does not seem unreasonably low, but then it was once thought that no one would ever need more than 640k of ram. If the new restriction isn't necessary, it would be great to remove it.
This has been requested before, but rejected for the lack of a valid use case. See issue 1636. I think supporting huge named tuples for the benefit of database applications is a valid use case.
On Fri, Sep 17, 2010 at 1:00 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
The origin of the change was a hack to fit positional argument counts and keyword-only argument counts in a single oparg in the python opcode encoding.
ISTM, this is an implementation specific hack and there is no reason that other implementations would have the same restriction (unless their starting point is Python's bytecode).
The good news is that long argument lists are uncommon. They probably only arise in cases with dynamically created functions and classes. Most people are unaffected.
The bad news is that an implementation detail has become visible and added a language restriction. The 255 limit seems weird to me in a version of Python that has gone to lengths to unify ints and longs so that char/short/long boundaries stop manifesting themselves to users.
Is there any support here for trying to get smarter about the keyword-only argument implementation? The 255 limit does not seem unreasonably low, but then it was once thought that no one would ever need more than 640k of ram. If the new restriction isn't necessary, it would be great to remove it.
+256 on removing this limit from the language. I've come across code generators that produced quite insane-looking code that worked perfectly fine because Python's grammar has no (or very large) limits, and I consider this a language feature. I've also written code where there was a good reason to use **kwds in the function definition and another good reason to pass **kwds to the call where the kwds passed could be huge. -- --Guido van Rossum (python.org/~guido)
On 09/18/10 06:00, Raymond Hettinger wrote:
The good news is that long argument lists are uncommon. They probably only arise in cases with dynamically created functions and classes. Most people are unaffected.
How about showing a Warning when trying to create a large namedtuple? The Warning would contain a reference to a bug issue, and describe that if they really, really need to have this limitation removed, then they should ask in the bug report. Just so that we don't complicate the code unnecessarily without a real usage.

In Python, classes are largely syntactic sugar for a dictionary anyway; if they need such a large namedtuple, they should probably reconsider and use a dictionary, a list, or real classes instead.
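A minimal sketch of the kind of warning Lie suggests (hypothetical helper, not part of collections.namedtuple):

    import warnings

    def check_field_count(field_names, limit=255):
        """Warn when a generated namedtuple definition would exceed the
        interpreter's argument limit; the tracker reference is a placeholder."""
        if len(field_names) > limit:
            warnings.warn("namedtuple with %d fields exceeds the %d-argument "
                          "limit on this interpreter (see the tracker issue)"
                          % (len(field_names), limit))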
Lie Ryan wrote:
On 09/18/10 06:00, Raymond Hettinger wrote:
The good news is that long argument lists are uncommon. They probably only arise in cases with dynamically created functions and classes. Most people are unaffected.
How about showing a Warning when trying to create a large namedtuple? The Warning would contain a reference to a bug issue, and describe that if they really, really need to have this limitation removed, then they should ask in the bug report. Just so that we don't complicate the code unnecessarily without a real usage.
In Python, classes are largely syntactic sugar for a dictionary anyway; if they need such a large namedtuple, they should probably reconsider and use a dictionary, a list, or real classes instead.
+1 on removing the restriction, just because I find large namedtuples useful. I work with large tables of data and often use namedtuples for their compactness. Python dictionaries have a large memory overhead compared to tuples. This restriction could seriously hamper my future efforts to migrate to Python 3. - Tal Einat
Raymond Hettinger <raymond.hettinger@...> writes:
One of the use cases for named tuples is to have them be automatically created
from a SQL query or CSV header.
Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
I'm not sure why you think this is new. It's been true from at least 2.5 as far as I can see.
On 21.10.2010 16:06, Benjamin Peterson wrote:
Raymond Hettinger <raymond.hettinger@...> writes:
One of the use cases for named tuples is to have them be automatically created
from a SQL query or CSV header.
Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
I'm not sure why you think this is new. It's been true from at least 2.5 as far as I can see.
You must be talking of a different restriction. This snippet works fine in 2.7, but raises a SyntaxError in 3.1:

    exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass")

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Georg Brandl wrote:
Am 21.10.2010 16:06, schrieb Benjamin Peterson:
Raymond Hettinger <raymond.hettinger@...> writes:
One of the use cases for named tuples is to have them be automatically created
from a SQL query or CSV header.
Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than 255 fields.
I'm not sure why you think this is new. It's been true from at least 2.5 as far as I can see.
You must be talking of a different restriction. This snippet works fine in 2.7, but raises a SyntaxError in 3.1:
exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass")
The AST code in 2.7 raises this error for function/method calls only. In 3.2, it also raises the error for function/method definitions.

Looking at the AST code, the limitation appears somewhat arbitrary. There's no comment in the code suggesting a reason for the limit, and it's still possible to pass in more arguments via *args and **kws - but without the built-in argument checking.

Could someone provide some insight ?

Note that it's not uncommon to have more than 255 possible function/method arguments in generated code, e.g. in database abstraction layers.

Thanks,
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010)
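As an illustration of that distinction (assumed snippets, only compiled, never executed): the call form is rejected by the AST code on both 2.7 and 3.x, while the def form is only rejected on 3.x:

    args_300 = ", ".join("a%d" % i for i in range(300))
    vals_300 = ", ".join(str(i) for i in range(300))

    src_call = "f(" + vals_300 + ")"            # >255 explicit call arguments
    src_def = "def f(" + args_300 + "): pass"   # >255 parameters in the def

    for src in (src_call, src_def):
        try:
            compile(src, "<test>", "exec")
            print("compiled OK")
        except SyntaxError as e:
            print("SyntaxError: %s" % e)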
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg <mal@egenix.com> wrote: ..
Looking at the AST code, the limitation appears somewhat arbitrary. There's no comment in the code suggesting a reason for the limit and it's still possible to pass in more arguments via *args and **kws - but without the built-in argument checking.
Could someone provide some insight ?
My understanding is that the limitation comes from the bytecode generation phase, not the AST. See also Guido's http://bugs.python.org/issue1636#msg58760.

According to the Python manual section on opcodes:

    CALL_FUNCTION(argc)
        Calls a function. The low byte of argc indicates the number of
        positional parameters, the high byte the number of keyword
        parameters. On the stack, the opcode finds the keyword parameters
        first. For each keyword argument, the value is on top of the key.
        Below the keyword parameters, the positional parameters are on the
        stack, with the right-most parameter on top. Below the parameters,
        the function object to call is on the stack. Pops all function
        arguments, and the function itself off the stack, and pushes the
        return value.

http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CAL...
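In other words, the two counts are packed into a single 16-bit oparg. A small illustrative sketch (not CPython code) of the packing described above:

    def unpack_call_function_oparg(argc):
        na = argc & 0xFF          # positional argument count (low byte)
        nk = (argc >> 8) & 0xFF   # keyword argument count (high byte)
        return na, nk

    # e.g. a call like f(1, 2, x=3) would compile to CALL_FUNCTION with
    # argc == 0x0102: two positional arguments, one keyword argument.
    assert unpack_call_function_oparg(0x0102) == (2, 1)

Since each count gets a single byte, neither can exceed 255 without a wider encoding.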
Alexander Belopolsky wrote:
On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg <mal@egenix.com> wrote: ..
Looking at the AST code, the limitation appears somewhat arbitrary. There's no comment in the code suggesting a reason for the limit and it's still possible to pass in more arguments via *args and **kws - but without the built-in argument checking.
Could someone provide some insight ?
My understanding is that the limitation comes from bytecode generation phase, not AST.
See also Guido's http://bugs.python.org/issue1636#msg58760.
According to Python manual section for opcodes,
CALL_FUNCTION(argc)
Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.
http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CAL...
Thanks for the insight.

Even with the one-byte-per-count limitation for positional and keyword arguments imposed by the byte code, the checks in ast.c are a bit too simple, since they apply a limit on the sum of positional and keyword args, whereas the byte code and VM can deal with up to 255 positional and 255 keyword arguments.

    if (nposargs + nkwonlyargs > 255) {
        ast_error(n, "more than 255 arguments");
        return NULL;
    }

I think this should be:

    if (nposargs > 255) {
        ast_error(n, "more than 255 positional arguments");
        return NULL;
    }
    if (nkwonlyargs > 255) {
        ast_error(n, "more than 255 keyword arguments");
        return NULL;
    }

There's a patch somewhere that turns Python's VM into a 16 or 32-bit byte code machine. Perhaps it's time to have a look at that again.

Do other Python implementations have such limitations ?

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010)
On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg <mal@egenix.com> wrote: ..
There's a patch somewhere that turns Python's VM into a 16 or 32-bit byte code machine. Perhaps it's time to have a look at that again.
This sounds like a reference to wpython: http://code.google.com/p/wpython/ I hope the 255-argument limitation can be removed by simpler means.
Alexander Belopolsky wrote:
On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg <mal@egenix.com> wrote: ..
There's a patch somewhere that turns Python's VM into a 16 or 32-bit byte code machine. Perhaps it's time to have a look at that again.
This sounds like a reference to wpython:
Indeed. That's what I was thinking of.
I hope the 255-argument limitation can be removed by simpler means.
Probably, but why not take this as a chance to improve other aspects of the CPython VM as well ? Here's a presentation by Cesare Di Mauro, the author of the patch: http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-bas... -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010)
Hi Marc
I hope the 255-argument limitation can be removed by simpler means.
Probably, but why not take this as a chance to improve other
aspects of the CPython VM as well ?
Here's a presentation by Cesare Di Mauro, the author of the patch:
http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-bas...
-- Marc-Andre Lemburg eGenix.com
This presentation was made for wpython 1.0 alpha, which was the first release I made. Last year I released the second (and last), wpython 1.1, which carries several other changes and optimizations. You can find the new project here: http://code.google.com/p/wpython2/ and the presentation here: http://wpython2.googlecode.com/files/Cleanup%20and%20new%20optimizations%20i... Cesare
On 21.10.2010 20:15, Benjamin Peterson wrote:
Georg Brandl <g.brandl@...> writes:
You must be talking of a different restriction.
I assumed Raymond was talking about calling a function with > 255 args.
And I assumed Raymond was talking about defining a function with > 255 args. Whatever, both instances should be fixed. Georg
2010/10/21 Benjamin Peterson <benjamin@python.org>
Georg Brandl <g.brandl@...> writes:
You must be talking of a different restriction.
I assumed Raymond was talking about calling a function with > 255 args.
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and which helps the virtual machine implementation (and implementors :P).

Python won't lose its "power" and "generality" if one VM (albeit the "mainstream" / "official" one) has some limits. We already have some other ones, such as max 65536 constants, names, globals and locals. Another one is the maximum 20 blocks for code object. Who thinks that such limits must be removed?

I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it. Changing the current opcode(s) is a very bad idea, since common cases will slow down.

Cesare
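A sketch of the kind of workaround Cesare alludes to (hypothetical function names): route the values through a single mapping instead of spelling out hundreds of keyword arguments at the call site.

    # Writing out col0=..., col1=..., ... literally at a call site would hit
    # the 255 limit; passing one dict (or using ** unpacking) does not.
    def insert_row(**values):
        return sorted(values)

    def insert_row_from(values):        # workaround: accept a single mapping
        return sorted(values)

    settings = {"col%d" % i: i for i in range(300)}
    insert_row_from(settings)
    insert_row(**settings)              # ** unpacking is not limited either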
On 22/10/2010 08:18, Cesare Di Mauro wrote:
2010/10/21 Benjamin Peterson <benjamin@python.org>
Georg Brandl <g.brandl@...> writes:
> You must be talking of a different restriction.
I assumed Raymond was talking about calling a function with > 255 args.
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and which helps the virtual machine implementation (and implementors :P).
Python won't lose its "power" and "generality" if one VM (albeit the "mainstream" / "official" one) has some limits.
We already have some other ones, such as max 65536 constants, names, globals and locals. Another one is the maximum 20 blocks for code object. Who thinks that such limits must be removed?
The BDFL thinks that 255 is too low.
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
Changing the current opcode(s) is a very bad idea, since common cases will slow down.
On Fri, 22 Oct 2010 18:44:08 +0100 MRAB <python@mrabarnett.plus.com> wrote:
On 22/10/2010 08:18, Cesare Di Mauro wrote:
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and which helps the virtual machine implementation (and implementors :P).
Python won't lose its "power" and "generality" if one VM (albeit the "mainstream" / "official" one) has some limits.
We already have some other ones, such as max 65536 constants, names, globals and locals. Another one is the maximum 20 blocks for code object. Who thinks that such limits must be removed?
The BDFL thinks that 255 is too low.
The BDFL can propose a patch :) Cheers Antoine.
Cesare Di Mauro wrote:
2010/10/21 Benjamin Peterson <benjamin@python.org>
Georg Brandl <g.brandl@...> writes:
You must be talking of a different restriction.
I assumed Raymond was talking about calling a function with > 255 args.
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and which helps the virtual machine implementation (and implementors :P).
Python won't lose its "power" and "generality" if one VM (albeit the "mainstream" / "official" one) has some limits.
We already have some other ones, such as max 65536 constants, names, globals and locals. Another one is the maximum 20 blocks for code object. Who thinks that such limits must be removed?
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
It's certainly rare when writing applications by hand, but such limits can be reached with code generators wrapping external resources such as database query rows, spreadsheet rows, sensor data input, etc. We've had such a limit before (number of lines in a module) and that was raised for the same reason.
Changing the current opcode(s) is a very bad idea, since common cases will slow down.
I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG for such cases. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2010)
2010/10/22 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
It's certainly rare when writing applications by hand, but such limits can be reached with code generators wrapping external resources such as database query rows, spreadsheet rows, sensor data input, etc.
We've had such a limit before (number of lines in a module) and that was raised for the same reason.
Changing the current opcode(s) is a very bad idea, since common cases will slow down.
I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG for such cases.
-- Marc-Andre Lemburg eGenix.com
I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535 maximum. I hope it'll be enough. :)

In ast.c:

ast_for_arguments:

    if (nposargs > 65535 || nkwonlyargs > 65535) {
        ast_error(n, "more than 65535 arguments");
        return NULL;
    }

ast_for_call:

    if (nargs + ngens > 65535 || nkeywords > 65535) {
        ast_error(n, "more than 65535 arguments");
        return NULL;
    }

In compile.c:

opcode_stack_effect:

    #define NARGS(o) (((o) & 0xff) + ((o) >> 8 & 0xff00) + 2*(((o) >> 8 & 0xff) + ((o) >> 16 & 0xff00)))
        case CALL_FUNCTION:
            return -NARGS(oparg);
        case CALL_FUNCTION_VAR:
        case CALL_FUNCTION_KW:
            return -NARGS(oparg)-1;
        case CALL_FUNCTION_VAR_KW:
            return -NARGS(oparg)-2;
    #undef NARGS
    #define NARGS(o) (((o) % 256) + 2*(((o) / 256) % 256))
        case MAKE_FUNCTION:
            return -NARGS(oparg) - ((oparg >> 16) & 0xffff);
        case MAKE_CLOSURE:
            return -1 - NARGS(oparg) - ((oparg >> 16) & 0xffff);
    #undef NARGS

compiler_call_helper:

    int len;
    int code = 0;

    len = asdl_seq_LEN(args) + n;
    n = len & 0xff | (len & 0xff00) << 8;
    VISIT_SEQ(c, expr, args);
    if (keywords) {
        VISIT_SEQ(c, keyword, keywords);
        len = asdl_seq_LEN(keywords);
        n |= (len & 0xff | (len & 0xff00) << 8) << 8;
    }

In ceval.c:

PyEval_EvalFrameEx:

    TARGET_WITH_IMPL(CALL_FUNCTION_VAR, _call_function_var_kw)
    TARGET_WITH_IMPL(CALL_FUNCTION_KW, _call_function_var_kw)
    TARGET(CALL_FUNCTION_VAR_KW)
    _call_function_var_kw:
    {
        int na = oparg & 0xff | oparg >> 8 & 0xff00;
        int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;

call_function:

    int na = oparg & 0xff | oparg >> 8 & 0xff00;
    int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;

A quick example:

s = '''def f(*Args, **Keywords):
    print('Got', len(Args), 'arguments and', len(Keywords), 'keywords')

def g():
    f(''' + ', '.join(str(i) for i in range(500)) + ', ' + ', '.join('k{} = {}'.format(i, i) for i in range(500)) + ''')

g()
'''

c = compile(s, '<string>', 'exec')
eval(c)
from dis import dis
dis(g)

The output is:

Got 500 arguments and 500 keywords

  5           0 LOAD_GLOBAL              0 (f)
              3 LOAD_CONST               1 (0)
              6 LOAD_CONST               2 (1)
[...]
           1497 LOAD_CONST             499 (498)
           1500 LOAD_CONST             500 (499)
           1503 LOAD_CONST             501 ('k0')
           1506 LOAD_CONST               1 (0)
           1509 LOAD_CONST             502 ('k1')
           1512 LOAD_CONST               2 (1)
[...]
           4491 LOAD_CONST             999 ('k498')
           4494 LOAD_CONST             499 (498)
           4497 LOAD_CONST            1000 ('k499')
           4500 LOAD_CONST             500 (499)
           4503 EXTENDED_ARG           257
           4506 CALL_FUNCTION     16905460
           4509 POP_TOP
           4510 LOAD_CONST               0 (None)
           4513 RETURN_VALUE

The dis module seems to have some problem displaying the correct extended value, but I have no time now to check and fix it.

Anyway, I'm still unconvinced of the need to raise the function def/call limits.

Cesare
Cesare Di Mauro wrote:
2010/10/22 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
It's certainly rare when writing applications by hand, but such limits can be reached with code generators wrapping external resources such as database query rows, spreadsheet rows, sensor data input, etc.
We've had such a limit before (number of lines in a module) and that was raised for the same reason.
Changing the current opcode(s) is a very bad idea, since common cases will slow down.
I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG for such cases.
-- Marc-Andre Lemburg eGenix.com
I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535 maximum. I hope it'll be enough. :)
Sure, we don't have to raise it to 2**64 :-) Looks like a pretty simple fix, indeed. I wish we could get rid of all the byte shifting and div'ery used in the byte compiler - I'm pretty sure that such operations are rather slow nowadays compared to working with 16-bit or 32-bit integers and dropping the notion of taking the word "byte" in byte code literally.
[snip: quoted patch and example]
Anyway, I'm still unconvinced of the need to raise the function def/call limits.
It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters, even if just 10 parameters are actually used later on. However, the user will care a lot if the generators fail due to such limits and then become unusable.

As an example, take a database query method that exposes 3-4 parameters for each query field. In more complex database schemas that you find in e.g. data warehouse applications, it is not uncommon to have 100+ query fields or columns in a data table.

With the current limit on function/call argument counts, such a model could not be mapped directly to Python. Instead, you'd have to turn to solutions based on other data structures that are not automatically checked by Python when calling methods/functions.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2010)
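A sketch of the kind of generated wrapper being described (hypothetical names): a code generator emits one keyword parameter per column, so a wide table produces a def that trips the limit on 3.x at definition time.

    columns = ["col%d" % i for i in range(300)]            # 300 columns
    params = ", ".join("%s=None" % c for c in columns)
    source = "def insert_row(%s):\n    return locals()\n" % params

    # Compiles on Python 2.7; on Python 3.1/3.2 it raises
    # "SyntaxError: more than 255 arguments".
    exec(source)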
On Sat, 23 Oct 2010 00:36:30 +0200 "M.-A. Lemburg" <mal@egenix.com> wrote:
It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters.
Why not make the code generators smarter?
On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
On Sat, 23 Oct 2010 00:36:30 +0200 "M.-A. Lemburg"<mal@egenix.com> wrote:
It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters.
Why not make the code generators smarter?
Because it makes more sense to fix it in one place than force the burden of coding around an arbitrary limit upon each such code generator. Eric.
On Monday, 25 October 2010 at 19:44 -0400, Eric Smith wrote:
On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
On Sat, 23 Oct 2010 00:36:30 +0200 "M.-A. Lemburg"<mal@egenix.com> wrote:
It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters.
Why not make the code generators smarter?
Because it makes more sense to fix it in one place than force the burden of coding around an arbitrary limit upon each such code generator.
Sure, but in the absence of anyone providing a patch for CPython, it is still a possible resolution. Regards Antoine.
Antoine Pitrou wrote:
On Monday, 25 October 2010 at 19:44 -0400, Eric Smith wrote:
On 10/22/2010 6:45 PM, Antoine Pitrou wrote:
On Sat, 23 Oct 2010 00:36:30 +0200 "M.-A. Lemburg"<mal@egenix.com> wrote:
It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters.
Why not make the code generators smarter?
I don't see a way to work around the limitation without starting every single wrapper object's .__init__() with a test routine that checks the parameters in Python - and that's not really feasible since it would kill performance. You'd also have to move all **kws parameters to locals in order to emulate the normal Python parameter invocation of the method.
Because it makes more sense to fix it in one place than force the burden of coding around an arbitrary limit upon each such code generator.
Sure, but in the absence of anyone providing a patch for CPython, it is still a possible resolution.
Cesare already posted a patch based on using EXTENDED_ARG. Should we reopen that old ticket or create a new one ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010)
Why not make the code generators smarter?
I don't see a way to work around the limitation without starting every single wrapper object's .__init__() with a test routine that checks the parameters in Python - and that's not really feasible since it would kill performance.
Have you considered that having 200 or 300 keyword arguments might already kill performance? I don't think our function invocation code is tuned for such a number.
Cesare already posted a patch based on using EXTENDED_ARG. Should we reopen that old ticket or create a new one ?
Was there an old ticket open? I have only seen a piece of code on python-ideas. Regardless, whether one or the other doesn't really matter, as long as it's recorded somewhere :) Regards Antoine.
2010/10/26 M.-A. Lemburg <mal@egenix.com>
Antoine Pitrou wrote:
Sure, but in the absence of anyone providing a patch for CPython, it is still a possible resolution.
Cesare already posted a patch based on using EXTENDED_ARG. Should we reopen that old ticket or create a new one ?
-- Marc-Andre Lemburg
I can provide another patch that will not use EXTENDED_ARG (no VM changes), and uses *args and/or **kwargs function calls when there are more than 255 arguments or keyword arguments. But I need some days. If needed, I'll post it by this weekend at the latest. Cesare
Cesare Di Mauro wrote:
2010/10/26 M.-A. Lemburg <mal@egenix.com>
Antoine Pitrou wrote:
Sure, but in the absence of anyone providing a patch for CPython, it is still a possible resolution.
Cesare already posted a patch based on using EXTENDED_ARG. Should we reopen that old ticket or create a new one ?
-- Marc-Andre Lemburg
I can provide another patch that will not use EXTENDED_ARG (no VM changes), and uses *args and/or **kwargs function calls when there are more than 255 arguments or keyword arguments.
But I need some days.
If needed, I'll post it by this weekend at the latest.
You mean a version that pushes the *args tuple and **kws dict on the stack and then uses those for calling the function/method ? I think that would be a lot more efficient than pushing/popping hundreds of parameters on/off the stack. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010)
2010/10/26 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
I can provide another patch that will not use EXTENDED_ARG (no VM changes), and uses *args and/or **kwargs function calls when there are more than 255 arguments or keyword arguments.
But I need some days.
If needed, I'll post it by this weekend at the latest.
You mean a version that pushes the *args tuple and **kws dict on the stack and then uses those for calling the function/method ?
I think that would be a lot more efficient than pushing/popping hundreds of parameters on/off the stack.
-- Marc-Andre Lemburg
I was referring to the solution (which I prefer) that I proposed in my answer to Greg, two days ago.

Unfortunately, the stack must be used whatever solution we choose. Pushing the "final" tuple and/or dictionary is a possible optimization, but we can use it only when we have a tuple or dict of constants; otherwise we need to use the stack.

Good case: f(1, 2, 3, a = 1, b = 2)
We can push the (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then call f with the CALL_FUNCTION_VAR_KW opcode, passing narg = nkarg = 0.

Worst case: f(1, x, 3, a = x, b = 2)
We can't push the tuple and dict as a whole, because they first need to be built using the stack.

The good case is possible, and I have already done some work in wpython collecting constants on parameter pushes (even partial constant sequences), but some additional work must be done to recognize constants-only tuples / dicts. However, the worst case remains unresolved.

Cesare
Cesare Di Mauro wrote:
2010/10/26 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
I can provide another patch that will not use EXTENDED_ARG (no VM changes), and uses *args and/or **kwargs function calls when there are more than 255 arguments or keyword arguments.
But I need some days.
If needed, I'll post it by this weekend at the latest.
You mean a version that pushes the *args tuple and **kws dict on the stack and then uses those for calling the function/method ?
I think that would be a lot more efficient than pushing/popping hundreds of parameters on/off the stack.
-- Marc-Andre Lemburg
I was referring to the solution (which I prefer) that I proposed in my answer to Greg, two days ago.
Unfortunately, the stack must be used whatever solution we choose.
Pushing the "final" tuple and/or dictionary is a possible optimization, but we can use it only when we have a tuple or dict of constants; otherwise we need to use the stack.
Good case: f(1, 2, 3, a = 1, b = 2)
We can push the (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then call f with the CALL_FUNCTION_VAR_KW opcode, passing narg = nkarg = 0.
Worst case: f(1, x, 3, a = x, b = 2)
We can't push the tuple and dict as a whole, because they first need to be built using the stack.
The good case is possible, and I have already done some work in wpython collecting constants on parameter pushes (even partial constant sequences), but some additional work must be done to recognize constants-only tuples / dicts.
However, the worst case remains unresolved.
I don't understand. What is the difference between pushing values on the stack and building a tuple/dict and then pushing those on the stack ? In your worst case example, the compiler would first build a tuple/dict using the args already on the stack (BUILD_TUPLE, BUILD_MAP) and then call the function with this tuple/dict combination - you'd basically move the tuple/dict building to the compiler rather than having the CALL* opcodes do this internally. It would essentially run: f(*(1,x,3), **{'a':x, 'b':2}) and bypass the "max. number of opcode args" limit without degrading performance, since BUILD_TUPLE et al. essentially run the same code for building the call arguments as the helpers for calling a function. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010)
2010/10/26 M.-A. Lemburg <mal@egenix.com>
Cesare Di Mauro wrote:
2010/10/26 M.-A. Lemburg <mal@egenix.com>
I was referring to the solution (which I prefer) that I proposed in my answer to Greg, two days ago.
Unfortunately, the stack must be used whatever solution we choose.
Pushing the "final" tuple and/or dictionary is a possible optimization, but we can use it only when we have a tuple or dict of constants; otherwise we need to use the stack.
Good case: f(1, 2, 3, a = 1, b = 2)
We can push the (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then call f with the CALL_FUNCTION_VAR_KW opcode, passing narg = nkarg = 0.
Worst case: f(1, x, 3, a = x, b = 2)
We can't push the tuple and dict as a whole, because they first need to be built using the stack.
The good case is possible, and I have already done some work in wpython collecting constants on parameter pushes (even partial constant sequences), but some additional work must be done to recognize constants-only tuples / dicts.
However, the worst case remains unresolved.
I don't understand. What is the difference between pushing values on the stack and building a tuple/dict and then pushing those on the stack ?
In your worst case example, the compiler would first build a tuple/dict using the args already on the stack (BUILD_TUPLE, BUILD_MAP) and then call the function with this tuple/dict combination - you'd basically move the tuple/dict building to the compiler rather than having the CALL* opcodes do this internally.
It would essentially run:
f(*(1,x,3), **{'a':x, 'b':2})
and bypass the "max. number of opcode args" limit without degrading performance, since BUILD_TUPLE et al. essentially run the same code for building the call arguments as the helpers for calling a function.
-- Marc-Andre Lemburg
Yes, the idea is to let the compiler emit proper code to build the tuple/dict, instead of using the CALL_* opcodes to do it, in order to bypass the current limits. That's assuming we don't want to change the current CALL_* behavior, so the common cases stay fast and a slower (but working) path is introduced for the uncommon ones.

Another solution would be to introduce a specific opcode, but I don't think it is worthwhile if the purpose is just to permit more than 255 arguments.

At this time I have no other ideas to solve this problem.

Please, let me know if there's interest in a new patch to implement the "compiler-based" solution.

Cesare
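For illustration, one can compare the bytecode the current compiler emits for the two spellings discussed above; the starred form already routes the arguments through BUILD_TUPLE/BUILD_MAP and a CALL_FUNCTION_VAR_KW whose oparg no longer counts the individual arguments (assumed demo, CPython 3.1/3.2):

    import dis

    dis.dis(compile("f(1, x, 3, a=x, b=2)", "<demo>", "eval"))
    dis.dis(compile("f(*(1, x, 3), **{'a': x, 'b': 2})", "<demo>", "eval"))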
On Tue, 26 Oct 2010 19:22:32 +0200 Cesare Di Mauro <cesare.di.mauro@gmail.com> wrote:
At this time I have no other ideas to solve this problem.
Please, let me know if there's interest on a new patch to implement the "compiler-based" solution.
Have you timed the EXTENDED_ARG solution? Regards Antoine.
2010/10/26 Antoine Pitrou <solipsis@pitrou.net>
On Tue, 26 Oct 2010 19:22:32 +0200 Cesare Di Mauro <cesare.di.mauro@gmail.com> wrote:
At this time I have no other ideas to solve this problem.
Please, let me know if there's interest on a new patch to implement the "compiler-based" solution.
Have you timed the EXTENDED_ARG solution?
Regards
Antoine.
I made some a few minutes ago, and the results are unbelievable and counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400, Windows 7 x64, Python 3.2a3 32 bits running at high priority):

python.exe -m timeit -r 1 -n 100000000 -s "def f(): pass" "f()"
Standard    : 100000000 loops, best of 1: 0.348 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.341 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z): pass" "f(1, 2, 3)"
Standard    : 100000000 loops, best of 1: 0.452 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.451 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(a = 1, b = 2, c = 3): pass" "f(a = 1, b = 2, c = 3)"
Standard    : 100000000 loops, best of 1: 0.578 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.556 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z, a = 1, b = 2, c = 3): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)"
Standard    : 100000000 loops, best of 1: 0.761 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.739 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args): pass" "f(1, 2, 3)"
Standard    : 100000000 loops, best of 1: 0.511 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.508 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(**Keys): pass" "f(a = 1, b = 2, c = 3)"
Standard    : 100000000 loops, best of 1: 0.789 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.784 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)"
Standard    : 100000000 loops, best of 1: 1.01 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 1.01 usec per loop

python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f()"
Standard    : 100000000 loops, best of 1: 0.393 usec per loop
EXTENDED_ARG: 100000000 loops, best of 1: 0.41 usec per loop

I really can't explain it. Ouch!

Cesare
2010/10/26 Antoine Pitrou <solipsis@pitrou.net>
[snip lots of timeit results comparing unpatched and EXTENDED_ARG]
I really can't explain it. Ouch!
What do you mean exactly? There's no significant change at all.
I cannot explain why the unpatched version was slower than the patched one most of the time. I find it silly and illogical. Cesare
I cannot explain why the unpatched version was slower than the patched one most of the time.
It just looks like measurement noise or, at worst, the side effect of slightly different code generation by the compiler. I don't think a ±1% variation on a desktop computer can be considered significant. (which means that the patch reaches its goal of not decreasing performance, anyway :-)) Regards Antoine.
On 26.10.2010 22:58, Cesare Di Mauro wrote:
2010/10/26 Antoine Pitrou <solipsis@pitrou.net>
> [snip lots of timeit results comparing unpatched and EXTENDED_ARG]
>
> I really can't explain it. Ouch!
What do you mean exactly? There's no significant change at all.
I cannot explain why the unpatched version was slower than the patched one most of the time.
I find it silly and illogical.
It rather seems that you're seeing statistical noise, and the impact of the change is not measurable. Nothing silly about it. Georg
Cesare Di Mauro wrote:
2010/10/26 Antoine Pitrou <solipsis@pitrou.net>
On Tue, 26 Oct 2010 19:22:32 +0200 Cesare Di Mauro <cesare.di.mauro@gmail.com> wrote:
At this time I have no other ideas to solve this problem.
Please, let me know if there's interest on a new patch to implement the "compiler-based" solution.
Have you timed the EXTENDED_ARG solution?
Regards
Antoine.
I made some a few minutes ago, and the results are unbelievable and counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400, Windows 7 x64, Python 3.2a3 32 bits running at high priority):
[snip timeit results comparing unpatched and EXTENDED_ARG]
I really can't explain it. Ouch!
Looks like a good solution to the problem - no performance loss and a much higher limit on the number of arguments.

-- Marc-Andre Lemburg
2010/10/23 M.-A. Lemburg <mal@egenix.com>
I wish we could get rid of all the byte shifting and div'ery used in the byte compiler - I'm pretty sure that such operations are rather slow nowadays compared to working with 16-bit or 32-bit integers and dropping the notion of taking the word "byte" in byte code literally.
Unfortunately we can't remove such shift & mask operations, even on non-byte(code) compilers/VMs. In wpython I handle 16- or 32-bit opcodes (it works on multiples of 16-bit words), but I have:

- specialized opcodes to call functions and procedures (i.e. functions whose result is discarded), which handle the most common cases (84-85% on average in the stats that I have collected from some projects and the standard library); I have packed a 4-bit nargs and a 4-bit nkwargs into a single byte in order to obtain a short (and fast) 16-bit opcode;
- big-endian systems still need to extract and "rotate" the bytes to get the correct word value(s).

So, even with word (and longword) representations, these operations are needed. The good thing is that they can be handled quite fast because oparg stays in one register, and the na and nk vars read (and manipulate) it independently, so a (common) out-of-order processor can do a good job scheduling and parallelizing such instructions, leaving only a few final dependencies (when recombining the shift and/or mask partial results). Some work can also be done reordering the instructions to improve execution on in-order processors.

It may seem strange to have functions, methods or object constructors
with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters. Even if just 10 parameters are actually used later on. However, the user will care a lot if the generators fail due to such limits and then become unusable.
As an example, take a database query method that exposes 3-4 parameters for each query field. In more complex database schemas that you find in e.g. data warehouse applications, it is not uncommon to have 100+ query fields or columns in a data table.
With the current limit in function/call argument counts, such a model could not be mapped directly to Python. Instead, you'd have to turn to solutions based on other data structures that are not automatically checked by Python when calling methods/functions.
-- Marc-Andre Lemburg
I understood the problem, but I don't know if this is the correct solution. Anyway, now there's at least one solution. :) Cesare
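As a rough illustration of the packing discussed above: CPython's CALL_FUNCTION oparg at the time carried the positional count in its low byte and the keyword count in its high byte, while wpython's short call opcodes, as described above, squeeze 4 bits of each count into a single byte for the common small calls. A small sketch in Python (the helper names are made up, not interpreter source):

# Illustrative only: packing and unpacking an argument-count pair.

def pack_8_8(na, nk):
    # CPython-style CALL_FUNCTION oparg: positional count in the low byte,
    # keyword count in the high byte (each limited to 0..255).
    assert 0 <= na <= 0xFF and 0 <= nk <= 0xFF
    return na | (nk << 8)

def unpack_8_8(oparg):
    return oparg & 0xFF, (oparg >> 8) & 0xFF

def pack_4_4(na, nk):
    # wpython-style short opcode for the common case: 4 bits each, so only
    # calls with at most 15 positional and 15 keyword arguments fit.
    assert 0 <= na <= 0xF and 0 <= nk <= 0xF
    return na | (nk << 4)

def unpack_4_4(byte):
    return byte & 0xF, (byte >> 4) & 0xF

assert unpack_8_8(pack_8_8(3, 2)) == (3, 2)
assert unpack_4_4(pack_4_4(3, 2)) == (3, 2)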
2010/10/22 Cesare Di Mauro <cesare.di.mauro@gmail.com>:
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
It does not need a new opcode. The bytecode can create an argument tuple explicitly and pass it like it passes *args. -- Marcin Kowalczyk
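At the source level, the transformation Marcin describes amounts to packing the explicit arguments into a tuple and passing it the way *args is passed; a minimal sketch (the 'report' function is just a stand-in, not anything from the thread):

# Source-level picture of the suggested transformation: instead of a call
# with N explicit arguments, build a tuple and pass it like *args.

def report(*values):
    return len(values)

# Explicit arguments (what the 255 limit applies to at the call site)...
direct = report(1, 2, 3)

# ...and the equivalent tuple-based call, which has no such limit.
packed = tuple(range(3000))
via_tuple = report(*packed)

assert direct == 3 and via_tuple == 3000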
2010/10/22 Marcin 'Qrczak' Kowalczyk <qrczak@knm.org.pl>
2010/10/22 Cesare Di Mauro <cesare.di.mauro@gmail.com>:
I think that having more than 255 arguments for a function call is a very rare case for which a workaround (maybe passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it.
It does not need a new opcode. The bytecode can create an argument tuple explicitly and pass it like it passes *args.
-- Marcin Kowalczyk
It'll be too slow. The current CALL_FUNCTION* opcodes use "packed" ints, not PyLongObject ints. With a tuple you would need (at least) to extract the PyLongs and convert them to ints before using them.

Cesare
Cesare Di Mauro wrote:
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and helps the virtual machine implementation
Is there any corresponding limit to the number of arguments to tuple and dict constructor? If not, the limit could perhaps be circumvented without changing the VM by having the compiler convert calls with large numbers of args into code that builds an appropriate tuple and dict and makes a *args/**kwds call. -- Greg
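For what it's worth, the restriction (where enforced) applies only to explicitly spelled-out parameters and arguments, not to values delivered through * / ** unpacking, so tuples and dicts themselves offer an escape hatch along the lines Greg suggests. A quick sketch with illustrative names:

# There is no 255-style limit on tuples or dicts themselves, nor on
# arguments passed via * / ** unpacking; the limit (where enforced) bites
# only explicitly written parameters and arguments.

def accept_anything(*args, **kwargs):
    return len(args), len(kwargs)

many_positional = tuple(range(5000))
many_keywords = {"field%d" % i: i for i in range(5000)}

print(accept_anything(*many_positional, **many_keywords))  # (5000, 5000)

# By contrast, source text that spells out more than 255 parameters or
# arguments literally -- e.g. a generated "def f(field0, field1, ...)" --
# is what triggered the SyntaxError on the affected 3.x versions.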
2010/10/24 Greg Ewing <greg.ewing@canterbury.ac.nz>
Cesare Di Mauro wrote:
I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live with, and helps the virtual machine implementation
Is there any corresponding limit to the number of arguments to tuple and dict constructor?
AFAIK there's no such limit. However, I'll use the BUILD_TUPLE and BUILD_MAP opcodes for that purpose, because they are faster.
If not, the limit could perhaps be circumvented without changing the VM by having the compiler convert calls with large numbers of args into code that builds an appropriate tuple and dict and makes a *args/**kwds call.
-- Greg
I greatly prefer this solution, but it's a bit more complicated when there are *args and/or **kwargs special arguments.

If we have > 255 args and *args is defined, we need to:
1) emit BUILD_TUPLE after pushing the regular arguments
2) emit LOAD_GLOBAL("tuple")
3) push *args
4) emit CALL_FUNCTION(1) to convert *args to a tuple
5) emit BINARY_ADD to append *args to the regular arguments
6) emit CALL_FUNCTION_VAR

If we have > 255 kwargs and **kwargs is defined, we need to:
1) emit BUILD_MAP after pushing the regular keyword arguments
2) emit LOAD_ATTR("update")
3) push **kwargs
4) emit CALL_FUNCTION(1) to update the regular keyword arguments with the ones in **kwargs
5) emit CALL_FUNCTION_KW

And, finally, we need to combine all of the above in the worst case. But, as I said, I prefer this one to handle the "complex" cases rather than changing the VM and slowing down the common ones.

Cesare
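At the Python level, the two emit sequences above boil down to concatenating the explicit positional arguments with the unpacked *args and merging the explicit keyword arguments with **kwargs before a single VAR/KW call. A hedged sketch of that semantics only (the helper and sample function are made up, not compiler code):

# Python-level meaning of the bytecode sketched above: concatenate the
# explicit positional arguments with the *args tuple, merge the explicit
# keyword arguments with **kwargs, then make one VAR/KW call.

def call_with_overflow(func, regular_args, star_args, regular_kwargs, star_kwargs):
    # BUILD_TUPLE ... tuple(*args) ... BINARY_ADD:
    merged_args = tuple(regular_args) + tuple(star_args)
    # BUILD_MAP ... dict.update(**kwargs):
    merged_kwargs = dict(regular_kwargs)
    merged_kwargs.update(star_kwargs)
    # CALL_FUNCTION_VAR / CALL_FUNCTION_KW (combined in the worst case):
    return func(*merged_args, **merged_kwargs)

def f(*args, **kwargs):
    return len(args), len(kwargs)

print(call_with_overflow(f, range(300), (1, 2),
                         {"k%d" % i: i for i in range(300)}, {"extra": 0}))
# (302, 301)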
participants (17)
- Alexander Belopolsky
- Antoine Pitrou
- Benjamin Peterson
- Cameron Simpson
- Cesare Di Mauro
- Eric Smith
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Lie Ryan
- M.-A. Lemburg
- Marcin 'Qrczak' Kowalczyk
- MRAB
- Nick Coghlan
- Raymond Hettinger
- Tal Einat
- Terry Reedy