IO implementation: in C and Python?
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors). Thoughts?
[1] http://bugs.python.org/issue4565
-- Regards, Benjamin
On Thu, 19 Feb 2009 at 21:41, Benjamin Peterson wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
I'm personally +0 on this, but I note that it is easier to read not just for other vm implementors, but for users. Witness the question about the behavior of 'for' vs 'readline'. I'd have had a much harder time figuring out the behavior if I'd had to read the C code. That said, I'm not personally sure if maintaining both versions is worth it. Real python developers should make that decision :) --RDM
On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson <benjamin@python.org> wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation. -Brett
On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon <brett@python.org> wrote:
On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson <benjamin@python.org> wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation.
You have been practice channeling me again, haven't you? I like the idea of having two (closely matching) implementations very much. In 2.x we did this on an ad-hoc basis, e.g. [c]StringIO, pickle/cPickle, heapq/_heapq. In 3.0 we've moved towards standardizing the approach -- the foo.py file first defines everything and then tries to import * from _foo on top of that. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, Feb 19, 2009 at 9:07 PM, Guido van Rossum <guido@python.org> wrote:
On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon <brett@python.org> wrote:
On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson <benjamin@python.org> wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation.
You have been practice channeling me again, haven't you? I like the idea of having two (closely matching) implementations very much.
Agreed. In particular, this helps any projects that are focused on improving the performance of pure-Python code: they can work on minimizing the delta between the Python and C versions. Collin
Guido van Rossum wrote:
On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon <brett@python.org> wrote:
On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson <benjamin@python.org> wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation.
You have been practice channeling me again, haven't you? I like the idea of having two (closely matching) implementations very much. In 2.x we did this on an ad-hoc basis, e.g. [c]StringIO, pickle/cPickle, heapq/_heapq. In 3.0 we've moved towards standardizing the approach -- the foo.py file first defines everything and then tries to import * from _foo on top of that.
Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately. Given the 3.0 approach, how would one access the Python versions without black magic or hacks? -- Steven
On Thu, Feb 19, 2009 at 21:35, Steven D'Aprano <steve@pearwood.info> wrote:
Guido van Rossum wrote:
On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon <brett@python.org> wrote:
On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson <benjamin@python.org> wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation.
You have been practice channeling me again, haven't you? I like the idea of having two (closely matching) implementations very much. In 2.x we did this on an ad-hoc basis, e.g. [c]StringIO, pickle/cPickle, heapq/_heapq. In 3.0 we've moved towards standardizing the approach -- the foo.py file first defines everything and then tries to import * from _foo on top of that.
Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately. Given the 3.0 approach, how would one access the Python versions without black magic or hacks?
As of right now there is no standard practice, although coming up with a standard way of handling this would probably be a good thing as this will also help with the testing story. -Brett
On Fri, Feb 20, 2009 at 12:35 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately. Given the 3.0 approach, how would one access the Python versions without black magic or hacks?
My preferred way to handle this is to keep the original Python implementations with a leading underscore (e.g., pickle._Pickler). I found this was the easiest way to test the C and Python implementations without resorting to import hacks. -- Alexandre
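A short sketch of how the underscore convention can be used for comparison, assuming a CPython where `pickle._Pickler` (a private name) is importable alongside the default `pickle.Pickler`. The `dumps_with` helper and the sample data are hypothetical, introduced only for illustration.

```python
import io
import pickle


def dumps_with(pickler_class, obj, protocol=2):
    """Serialize obj with the given Pickler class and return the bytes."""
    buffer = io.BytesIO()
    pickler_class(buffer, protocol).dump(obj)
    return buffer.getvalue()


sample = {"name": "io-c", "sizes": [1, 2, 3], "nested": ("a", None, 4.5)}

# pickle.Pickler is the C version when the accelerator is available;
# pickle._Pickler is the pure-Python class kept under a leading underscore.
default_bytes = dumps_with(pickle.Pickler, sample)
python_bytes = dumps_with(pickle._Pickler, sample)

# Compare behaviour through a round trip rather than byte-for-byte output.
restored_default = pickle.loads(default_bytes)
restored_python = pickle.loads(python_bytes)
```

The round trip is compared instead of the raw bytes because two implementations may legitimately differ in encoding details while still agreeing on the reconstructed object.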
Steven D'Aprano wrote:
Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately.
Also, won't foo.py be wasting time in most cases by defining python versions that get overwritten? Instead of defining things directly in foo.py, maybe it should do

    try:
        from cFoo import *
    except ImportError:
        from pyFoo import *

Then the fast path will be taken if cFoo is available, and you can directly import cFoo or pyFoo if you want. -- Greg
On Fri, Feb 20, 2009 at 13:15, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
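The dispatch suggested here can be sketched in a self-contained way. This is only an illustration under stated assumptions: `cFoo` and `pyFoo` are the hypothetical module names from the message, `import_first_available` is a helper invented for the example, and `pyFoo` is registered by hand while `cFoo` is assumed not to be installed.

```python
import importlib
import sys
import types


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (module_names,))


# Hypothetical pure-Python module, registered by hand for the demonstration.
pyFoo = types.ModuleType("pyFoo")
pyFoo.greet = lambda: "hello from pyFoo"
sys.modules["pyFoo"] = pyFoo

# cFoo is assumed to be missing, so the fallback is selected.
foo = import_first_available("cFoo", "pyFoo")
```

With this layout either implementation can still be imported directly by name, which is the property the message is after. The sketch picks one whole module, so it does not address modules where only some functions have an accelerated version.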
Steven D'Aprano wrote:
Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately.
Also, won't foo.py be wasting time in most cases by defining python versions that get overwritten?
But that's only at import time and that is rather minor compared to other execution costs.
Instead of defining things directly in foo.py, maybe it should do
    try:
        from cFoo import *
    except ImportError:
        from pyFoo import *
Then the fast path will be taken if cFoo is available, and you can directly import cFoo or pyFoo if you want.
See the other thread I started on discussing best practices for this, but this won't work for modules where only part of the implementation has an optimized version in an extension module (e.g. pickle). -Brett
Greg Ewing wrote:
Instead of defining things directly in foo.py, maybe it should do
    try:
        from cFoo import *
    except ImportError:
        from pyFoo import *
Then the fast path will be taken if cFoo is available, and you can directly import cFoo or pyFoo if you want.
For what it's worth, I like that naming convention better than the current conventions Foo/cFoo, Foo/_Foo. -- Steven
Benjamin Peterson <benjamin <at> python.org> writes:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
As I said, if it is the way forward, I suggest putting the Python version in a separate module (e.g. pyio.py), for the sake of clarity and also because it may slightly improve startup times (the pure-Python module wouldn't get imported in normal conditions). Your thoughts? Regards Antoine.
Antoine Pitrou schrieb:
Benjamin Peterson <benjamin <at> python.org> writes:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
As I said, if it is the way forward, I suggest putting the Python version in a separate module (e.g. pyio.py), for the sake of clarity and also because it may slightly improve startup times (the pure-Python module wouldn't get imported in normal conditions).
Your thoughts?
I just hope everyone updates both versions when making changes to IO. This is somewhat of a non-problem for small modules like bisect, or heapq. For pickle and StringIO, we already saw how not to do it in 2.x -- hopefully the new _pickle and pickle modules stay compatible. IO is much larger a piece of code... Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Georg Brandl <g.brandl <at> gmx.net> writes:
I just hope everyone updates both versions when making changes to IO.
My proposal is just organizational, it is neutral in terms of whether or not the Python version is correctly maintained. We can hope that the IO lib *semantics* won't change too much in the future (although there is an IMO legitimate request for a setblocking() method: http://bugs.python.org/issue949667). On the other hand, I don't expect anyone to willingly use the Python version if the C version is available.
Antoine Pitrou wrote:
Georg Brandl <g.brandl <at> gmx.net> writes:
I just hope everyone updates both versions when making changes to IO.
My proposal is just organizational, it is neutral in terms of whether or not the Python version is correctly maintained.
We can hope that the IO lib *semantics* won't change too much in the future (although there is an IMO legitimate request for a setblocking() method: http://bugs.python.org/issue949667). On the other hand, I don't expect anyone to willingly use the Python version if the C version is available.
If they're functionally equivalent and a single set of tests is run on both then -- assuming good tests -- breakage would be noticed... Michael
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev
On Fri, Feb 20, 2009 at 4:01 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Georg Brandl <g.brandl <at> gmx.net> writes:
I just hope everyone updates both versions when making changes to IO.
My proposal is just organizational, it is neutral in terms of whether or not the Python version is correctly maintained.
I worry that with your proposal people are once again going to import the pure Python version where they shouldn't. Maybe _pyio.py would work though?
We can hope that the IO lib *semantics* won't change too much in the future (although there is an IMO legitimate request for a setblocking() method: http://bugs.python.org/issue949667). On the other hand, I don't expect anyone to willingly use the Python version if the C version is available.
Hoping that modules won't evolve is futile. The concern for divergence is real. Unit-testing both with the same tests might be the solution. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido <at> python.org> writes:
I worry that with your proposal people are once again going to import the pure Python version where they shouldn't. Maybe _pyio.py would work though?
I'm ok with _pyio.py.
Hoping that modules won't evolve is futile. The concern for divergence is real. Unit-testing both with the same tests might be the solution.
Yes, the same tests should be applied to both (modulo the few ones that test for implementation-specific behaviour, e.g. the max_buffer_size stuff). Regards Antoine.
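One way to apply the same tests to both implementations is a shared mixin with one concrete test class per implementation. The sketch below assumes a Python in which both `io` and a pure-Python `_pyio` module are importable, as in later CPython releases; the class names, test names, and the `run_shared_tests` helper are hypothetical.

```python
import io
import unittest

import _pyio  # assumed importable: the pure-Python counterpart of io


class BytesIOTestsMixin:
    """Tests written once; subclasses pick the implementation under test."""

    BytesIO = None  # set by each concrete subclass

    def test_write_then_read(self):
        buf = self.BytesIO()
        self.assertEqual(buf.write(b"abc"), 3)
        self.assertEqual(buf.getvalue(), b"abc")
        buf.seek(0)
        self.assertEqual(buf.read(), b"abc")

    def test_readline(self):
        buf = self.BytesIO(b"one\ntwo\n")
        self.assertEqual(buf.readline(), b"one\n")
        self.assertEqual(buf.read(), b"two\n")


class CBytesIOTests(BytesIOTestsMixin, unittest.TestCase):
    BytesIO = io.BytesIO


class PyBytesIOTests(BytesIOTestsMixin, unittest.TestCase):
    BytesIO = _pyio.BytesIO


def run_shared_tests():
    """Run every shared test against both implementations."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (CBytesIOTests, PyBytesIOTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests for implementation-specific behaviour would live only in the relevant concrete class, so the mixin stays limited to semantics that both versions must share.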
Guido van Rossum wrote:
On Fri, Feb 20, 2009 at 4:01 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Georg Brandl <g.brandl <at> gmx.net> writes:
I just hope everyone updates both versions when making changes to IO. My proposal is just organizational, it is neutral in terms of whether or not the Python version is correctly maintained.
I worry that with your proposal people are once again going to import the pure Python version where they shouldn't. Maybe _pyio.py would work though?
From a user perspective, continuity of 'import xyz' importing the currently best implementation is what is important, even if that switches back and forth.
We can hope that the IO lib *semantics* won't change too much in the future (although there is an IMO legitimate request for a setblocking() method: http://bugs.python.org/issue949667). On the other hand, I don't expect anyone to willingly use the Python version if the C version is available.
Hoping that modules won't evolve is futile. The concern for divergence is real. Unit-testing both with the same tests might be the solution.
It seems to me that starting new features with a new test and prototyping in the Python version should mostly avoid the problem. Keeping the Python version allows non-C Pythoneers to contribute to such efforts. (As opposed to fixing a C-only bug.) If the Python version is ahead at the time of a release, the Python version could be reverted to being a master version that imports the C version for most but not all functions. tjr
On Thu, 19 Feb 2009 21:41:51 -0600, Benjamin Peterson wrote:
As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors).
Thoughts?
How about making it an optional module instead? A flag set when compiling Python would determine whether the Python version, the C version, or both versions of the libraries are included, with C-only as the default. Alternatively, if the compile flag was turned off and you want access to the Python version, provide a downloadable pure-Python library (an OS package manager could have something like python-lib-purepy or something similar). This would streamline Python, and only people who want to mess around would download the purepy version.
participants (13)
- Alexandre Vassalotti
- Antoine Pitrou
- Benjamin Peterson
- Brett Cannon
- Collin Winter
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Lie Ryan
- Michael Foord
- rdmurray@bitdance.com
- Steven D'Aprano
- Terry Reedy