Proposed: The Great Argument Clinic Conversion Derby

Let me start with a summary of the current status of Argument Clinic. It's checked in, it seems to be working fine. As of Friday I've checked in some reasonably complete documentation as a howto:

http://docs.python.org/3.4/howto/clinic.html

At last, here in beta 2, Argument Clinic is ready for prime time.

What about adoption? That's where Argument Clinic has stalled. By my estimate, there are about six hundred places that could be converted to work with Argument Clinic in CPython; as of this writing only a dozen or two have actually been converted.

Now, properly converting a function to work with Argument Clinic does not change its behavior. Internally, the code performing argument parsing should be nigh-identical; it should call the same PyArg_Parse function, with the same arguments, and the implementation should perform the same work as a result. The only externally observable change should be that inspect.signature() now produces a valid signature for the builtin; in all other respects Python should be unchanged. No documentation should have to change, no tests should need to be modified, and absolutely no code should be broken as a result. Converting a function to use Argument Clinic should be a blissfully low-risk procedure, and produce a pleasant, easier-to-maintain result.

You see where I'm going with this. I am now, reluctantly, proposing that once 3.4.0b2 ships (should be later today), we the Python core development community roll up our collective sleeves and attempt to convert all the builtins to work with Argument Clinic. I call this "The Great Argument Clinic Conversion Derby".

The rules of the derby:

* The derby stops when RC 1 gets tagged, which should be January 18th.
* I'll create issues on the issue tracker for converting each C file.
* Participants will take ownership of an issue for a particular file, and have a couple of days to submit a patch. If an issue languishes I reserve the right to reassign it.
* I pledge to be highly available and responsive during the derby.
* I volunteer to convert posixmodule.c, which is about 60 functions (and therefore 10% of the workload).
* I volunteer to review patches until my eyes bleed. I'd prefer to review every single conversion, though it's possible that isn't feasible, not sure. (Though I will have a /lot/ of time I can devote to this.)
* I'll create a leader board where contributors are ranked by how many functions they've converted, if people want it, in an endeavor to spark interest and provide some bragging rights.

Upsides:

* Every builtin we convert is one more builtin with introspection information. It'd be nice to have that in 3.4.
* Easier maintenance going forward.

Downsides:

* Someone could improperly convert a function, which could change the builtin's semantics and break code, and nobody notices and we ship the breakage in 3.4.0 final.

I've discussed this with a number of other core developers; so far I've only gotten positive responses. Otherwise I wouldn't propose such madness. (Making changes to 600 different places in the Python tree? What am I thinking?)

Keep in mind, this isn't "now or never"; the choice is between "convert now for 3.4" and "wait until after 3.4 final, then convert everything, and it'll ship in 3.5". We'll have this sooner or later--the question is, sooner? or later?

What say you? +1? -1e100?

Anxiously yours,
/arry
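
The message above names inspect.signature() as the one externally observable change a conversion should make. A minimal sketch of how that could be checked from Python follows; the helper names signature_or_none and survey are invented for illustration and are not part of Argument Clinic or CPython, and the counts depend entirely on which interpreter runs it.

```python
import builtins
import inspect


def signature_or_none(func):
    """Return str(inspect.signature(func)), or None if introspection fails."""
    try:
        return str(inspect.signature(func))
    except (ValueError, TypeError):
        # ValueError: no signature information is available for the callable.
        # TypeError: the object is not something inspect.signature() supports.
        return None


def survey(module):
    """Split a module's public callables by whether they expose a signature."""
    with_sig, without_sig = [], []
    for name in sorted(dir(module)):
        obj = getattr(module, name)
        if name.startswith("_") or not callable(obj):
            continue
        if signature_or_none(obj) is not None:
            with_sig.append(name)
        else:
            without_sig.append(name)
    return with_sig, without_sig


if __name__ == "__main__":
    have, lack = survey(builtins)
    print(f"{len(have)} builtins expose a signature, {len(lack)} do not")
    print("without:", ", ".join(lack))
```

Running the same survey before and after a conversion patch would show which names moved from the second list to the first.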

On 1/5/2014 11:21 AM, Larry Hastings wrote:
Let me start with a summary of the current status of Argument Clinic. It's checked in, it seems to be working fine. As of Friday I've checked in some reasonably complete documentation as a howto:
http://docs.python.org/3.4/howto/clinic.html
At last, here in beta 2, Argument Clinic is ready for prime time.
What about adoption? That's where Argument Clinic has stalled. By my estimate, there are about six hundred places that could be converted to work with Argument Clinic in CPython; as of this writing only a dozen or two have actually been converted.
Do you remember which? I suggest builtin classes and functions as priorities. ...
You see where I'm going with this. I am now, reluctantly, proposing that once 3.4.0b2 ships (should be later today), we the Python core development community roll up our collective sleeves and attempt to convert all the builtins to work with Argument Clinic.
I will try to speed up my timetable for converting Idle calltips to using inspect.signature instead of the older functions. Does help (pydoc) already do so? -- Terry Jan Reedy
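
A rough sketch of what a call tip built on inspect.signature could look like, with a docstring fallback for callables that expose no signature. This is an illustration only, not IDLE's implementation; the function name calltip and the fallback rule are assumptions.

```python
import inspect


def calltip(obj):
    """Build a one-line call tip for obj.

    Prefer inspect.signature(); if no signature is available, fall back
    to the first line of the docstring, and finally to the bare name.
    """
    name = getattr(obj, "__name__", type(obj).__name__)
    try:
        return f"{name}{inspect.signature(obj)}"
    except (ValueError, TypeError):
        # Hypothetical fallback: many unconverted builtins describe their
        # call syntax on the first docstring line.
        doc = inspect.getdoc(obj) or ""
        first_line = doc.splitlines()[0] if doc else ""
        return first_line or name
```

For a builtin that has been converted, the first branch is taken and the tip comes straight from the introspection data; for an unconverted one, the tip is only as good as its docstring.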

On 01/05/2014 11:49 AM, Terry Reedy wrote:
On 1/5/2014 11:21 AM, Larry Hastings wrote:
By my estimate, there are about six hundred places that could be converted to work with Argument Clinic in CPython; as of this writing only a dozen or two have actually been converted.
Do you remember which? I suggest builtin classes and functions as priorities.
I don't, but they're easy to find with UNIX shell tools: fgrep -l clinic */*.c
I will try to speed up my timetable for converting Idle calltips to using inspect.signature instead of the older functions. Does help (pydoc) already do so?
Yes. //arry/
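
For anyone without fgrep at hand, the search above can be approximated in Python. This is a sketch under assumptions: the function name c_files_mentioning is made up, and it is meant to be run from the root of a CPython checkout so that the */*.c pattern covers directories such as Modules and Objects.

```python
from pathlib import Path


def c_files_mentioning(root, needle="clinic"):
    """Return paths matching root/*/*.c whose contents contain needle."""
    matches = []
    for path in sorted(Path(root).glob("*/*.c")):
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            matches.append(path)
    return matches


if __name__ == "__main__":
    # Assumes the current directory is the root of a CPython checkout.
    for path in c_files_mentioning("."):
        print(path)
```

Like fgrep -l, this matches a fixed string and lists each file once, so it finds files that mention Clinic anywhere, not only files that have been fully converted.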

It looks interesting enough. I volunteer to convert at least the audioop, grp, operator, pwd, spw, sre, struct, tkinter modules (audioop already converted, tkinter in progress). If no one else takes them, I will perhaps convert the builtins, sys, itertools, functools modules and the str, bytes, bytearray, int objects. But I am very much upset by the fact that the generated code is mixed in with the manually written code. It is difficult to navigate (the list of symbols now contains three times as many names), makes the code difficult to read, and invites errors (editing the generated code). It would be better if the generated code were written to separate files.

On 01/05/2014 01:49 PM, Serhiy Storchaka wrote:
But I am very much upset by the fact that the generated code is mixed in with the manually written code. It is difficult to navigate (the list of symbols now contains three times as many names), makes the code difficult to read, and invites errors (editing the generated code). It would be better if the generated code were written to separate files.
I had that working at one point. Guido said no, keep it all in one file. I'm flexible but first you'd have to convince him. Cheers, //arry/

On 6 Jan 2014 05:54, "Larry Hastings" <larry@hastings.org> wrote:
On 01/05/2014 01:49 PM, Serhiy Storchaka wrote:
But I am very much upset by the fact that the generated code is mixed in with the manually written code. It is difficult to navigate (the list of symbols now contains three times as many names), makes the code difficult to read, and invites errors (editing the generated code). It would be better if the generated code were written to separate files.
I had that working at one point. Guido said no, keep it all in one file. I'm flexible but first you'd have to convince him.
It's also not something we're stuck with forever - we can start with it inline (which has the advantage of keeping all the code in the same place), and later move to having the helpers in a separate file included from the implementation file if we decide it makes sense to do so.
This was discussed a fair bit at the last language summit (and the day after between me, Guido and Larry), and the thing I like about the current approach is that a C coder should be able to understand the generated code *as C code* without needing to know anything about Argument Clinic and without needing to hunt through other files to find where the generated pieces are defined.
As Terry noted, even if we just get "help(name)" working properly for the builtins, I'll count that as a major win.
Cheers, Nick.

Nick Coghlan <ncoghlan@gmail.com> wrote:
I had that working at one point. Guido said no, keep it all in one file. I'm flexible but first you'd have to convince him.
It's also not something we're stuck with forever - we can start with it inline (which has the advantage of keeping all the code in the same place), and later move to having the helpers in a separate file included from the implementation file if we decide it makes sense to do so.
If we move big chunks of code around twice, I guess "hg blame" will break twice, too. That is another thing worth considering. I agree with Serhiy, but that is probably known at this point. :) Stefan Krah

On Mon, 6 Jan 2014 00:25:53 +0100 Stefan Krah <stefan@bytereef.org> wrote:
Nick Coghlan <ncoghlan@gmail.com> wrote:
I had that working at one point. Guido said no, keep it all in one file. I'm flexible but first you'd have to convince him.
It's also not something we're stuck with forever - we can start with it inline (which has the advantage of keeping all the code in the same place), and later move to having the helpers in a separate file included from the implementation file if we decide it makes sense to do so.
If we move big chunks of code around twice, I guess "hg blame" will break twice, too. That is another thing worth considering.
Breaking on generated code doesn't sound very annoying, though.
I agree with Serhiy, but that is probably known at this point. :)
I agree with Serhiy and you too. Clinic's current output makes C files more tedious to read, and I'm not really willing to participate in the "conversion derby" because of that. What were Guido's arguments? Also, see http://bugs.python.org/issue19723 Regards Antoine.

On Mon, Jan 6, 2014 at 2:17 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I agree with Serhiy, but that is probably known at this point. :)
I agree with Serhiy and you too. Clinic's current output makes C files more tedious to read, and I'm not really willing to participate in the "conversion derby" because of that.
My first thought was that this exercise falls into the realm of fixing things which aren't broken. Skip

On Mon, Jan 6, 2014 at 3:40 PM, Skip Montanaro <skip@pobox.com> wrote:
On Mon, Jan 6, 2014 at 2:17 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I agree with Serhiy, but that is probably known at this point. :)
I agree with Serhiy and you too. Clinic's current output makes C files more tedious to read, and I'm not really willing to participate in the "conversion derby" because of that.
My first thought was that this exercise falls into the realm of fixing things which aren't broken.
The gain in introspection now and possible automated improvements later (e.g. if we come up with a faster way to parse arguments it will automatically propagate through the code base) make it worth it.

On Mon, Jan 6, 2014 at 10:17 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 6 Jan 2014 00:25:53 +0100 Stefan Krah <stefan@bytereef.org> wrote:
Nick Coghlan <ncoghlan@gmail.com> wrote:
I had that working at one point. Guido said no, keep it all in one file. I'm flexible but first you'd have to convince him.
It's also not something we're stuck with forever - we can start with it inline (which has the advantage of keeping all the code in the same place), and later move to having the helpers in a separate file included from the implementation file if we decide it makes sense to do so.
If we move big chunks of code around twice, I guess "hg blame" will break twice, too. That is another thing worth considering.
Breaking on generated code doesn't sound very annoying, though.
That depends on how stressed you are when you are trying to use hg blame to figure out where a certain breakage was introduced, when and by whom.
I agree with Serhiy, but that is probably known at this point. :)
I agree with Serhiy and you too. Clinic's current output makes C files more tedious to read, and I'm not really willing to participate in the "conversion derby" because of that. What were Guido's arguments?
Also, see http://bugs.python.org/issue19723
I added a hopefully useful suggestion there; ISTM the situation can easily be improved by changing the wording of the magic comments. I'm not yet convinced that the generated code is better off in separate files nor why that is considered such a big deal. And how would you prevent the generated functions from becoming externally visible? As long as they are in the same file they can be static. (I'm not a fan of #include to stitch files together.) -- --Guido van Rossum (python.org/~guido)

On Sun, Jan 5, 2014 at 11:21 AM, Larry Hastings <larry@hastings.org> wrote:
Now, properly converting a function to work with Argument Clinic does not change its behavior. Internally, the code performing argument parsing should be nigh-identical; it should call the same PyArg_Parse function, with the same arguments, and the implementation should perform the same work as a result. The only externally observable change should be that inspect.signature() now produces a valid signature for the builtin; in all other respects Python should be unchanged. No documentation should have to change, no tests should need to be modified, and absolutely no code should be broken as a result. Converting a function to use Argument Clinic should be a blissfully low-risk procedure, and produce a pleasant, easier-to-maintain result.
Hi, If it goes forward I would be willing to help out with the derby on a few modules. I haven't followed the Argument Clinic arguments closely before now, so I don't know if this question has been addressed. I didn't see it mentioned in the docs anywhere, but will the policy be to *prefer* renaming existing functions to the names generated by clinic (the "_impl" names) or to override that to keep the existing names? I ask because some built-in functions are used internally by other built-in functions. I don't know how common this is but, for example, fileio_read calls fileio_readall. So if fileio_readall is renamed to io_FileIO_readall_impl or whatever we need to also go through and fix any references to fileio_readall. Should be easy enough, but I wonder if there are any broader side-effects of this. Might it be safer for the first round to keep the existing function names? Erik

06.01.14 22:53, Erik Bray wrote:
I ask because some built-in functions are used internally by other built-in functions. I don't know how common this is but, for example, fileio_read calls fileio_readall. So if fileio_readall is renamed to io_FileIO_readall_impl or whatever we need to also go through and fix any references to fileio_readall. Should be easy enough, but I wonder if there are any broader side-effects of this. Might it be safer for the first round to keep the existing function names?
You can leave fileio_readall as is and call it from io_FileIO_readall_impl and other places.

participants (10): Antoine Pitrou, Brett Cannon, Erik Bray, Guido van Rossum, Larry Hastings, Nick Coghlan, Serhiy Storchaka, Skip Montanaro, Stefan Krah, Terry Reedy