Hi Pauli, It looks like you are doing great stuff with the py3k transition. Do you and David have any sort of merge schedule in mind? Chuck
On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris
Hi Pauli,
It looks like you are doing great stuff with the py3k transition. Do you and David have any sort of merge schedule in mind?
I have updated my py3k branch for numpy.distutils, and it is ready to merge:
http://github.com/cournape/numpy/tree/py3k_bootstrap_take3
I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on Linux at least. The patch is much smaller than my previous attempts as well, so I would just push it to the trunk, and deal with the issues as they come.
cheers, David
Tue, 01 Dec 2009 17:31:10 +0900, David Cournapeau wrote:
On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris
wrote: It looks like you are doing great stuff with the py3k transition. Do you and David have any sort of merge schedule in mind?
I have updated my py3k branch for numpy.distutils, and it is ready to merge:
http://github.com/cournape/numpy/tree/py3k_bootstrap_take3
I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on Linux at least. The patch is much smaller than my previous attempts as well, so I would just push it to the trunk, and deal with the issues as they come.
I think I should rebase my branch on this, or vice versa, to avoid further duplicated work.
I think most of my changes would be ready for SVN, after rebasing and regrouping via rebase -i -- they do not affect behavior on Py2, and for the most part the changes required in C code are quite obvious.
The largest changes are probably related to the 'S' data type.
In other news, we cannot support Py2 pickles in Py3 -- this is because Py2 str is unpickled as Py3 str, resulting in encoding failures even before the data is passed on to Numpy.
But in any case, Py3 support for Numpy 1.5.0 seems a completely realistic plan.
-- Pauli Virtanen
Pauli Virtanen wrote:
I think I should rebase my branch on this, or vice versa, to avoid further duplicated work.
I think I will just commit my branch to the trunk ASAP - I expect more breakage from my code than yours, and the sooner the better for distutils-related changes.
cheers, David
Thu, 03 Dec 2009 18:23:28 +0900, David Cournapeau wrote:
Pauli Virtanen wrote:
I think I should rebase my branch on this, or vice versa, to avoid further duplicated work.
I think I will just commit my branch to the trunk ASAP - I expect more breakage from my code than yours, and the sooner the better for distutils-related changes.
Ok, I'll follow that up with the more innocuous changesets:
1) Stuff needed to make Numpy C modules build on Py3K
2) The evil 2to3 autoconversion hack, which we may want to get rid of in the long run.
3) PEP 3118
4) Obvious PyBytes vs. PyUnicode changes
I'll try to get this done ASAP, too, after the distutils stuff is in.
-- Pauli Virtanen
Pauli Virtanen wrote:
Thu, 03 Dec 2009 18:23:28 +0900, David Cournapeau wrote:
Pauli Virtanen wrote:
I think I should rebase my branch on this, or vice versa, to avoid further duplicated work.
I think I will just commit my branch to the trunk ASAP - I expect more breakage from my code than yours, and the sooner the better for distutils-related changes.
Ok, I'll follow that up with the more innocuous changesets
Ok, the patch is being committed to the trunk right now. I have not thoroughly tested it, but at least python 2.4/2.6/3.1 all build (up to the first build failure for 3.1 of course), without the need to apply 2to3 to numpy.distutils.
Also, do not forget the NPY_SEPARATE_COMPILATION option - it makes an appreciable difference when compiling a partial build :)
cheers, David
On Thu, Dec 3, 2009 at 10:39 AM, Pauli Virtanen
wrote:
Tue, 01 Dec 2009 17:31:10 +0900, David Cournapeau wrote:
On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris
wrote: It looks like you are doing great stuff with the py3k transition. Do you and David have any sort of merge schedule in mind?
I have updated my py3k branch for numpy.distutils, and it is ready to merge:
http://github.com/cournape/numpy/tree/py3k_bootstrap_take3
I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on Linux at least. The patch is much smaller than my previous attempts as well, so I would just push it to the trunk, and deal with the issues as they come.
I think I should rebase my branch on this, or vice versa, to avoid further duplicated work.
I think most of my changes would be ready for SVN, after rebasing and regrouping via rebase -i -- they do not affect behavior on Py2, and for the most part the changes required in C code are quite obvious.
The largest changes are probably related to the 'S' data type.
In other news, we cannot support Py2 pickles in Py3 -- this is because Py2 str is unpickled as Py3 str, resulting in encoding failures even before the data is passed on to Numpy.
Is this just for the type codes? Or is there other string data that needs to be pickle loaded? If it is just for the type codes, they are all within the ANSI character set and unpickle fine without errors. I'm guessing numpy uses strings to pickle arrays?
Note that the pickle module is extensible, so we might be able to get it to special-case things. You can subclass Unpickler to make extensions, and there are other techniques. It is even possible to submit patches to Python if we need something it doesn't support.
It is also possible to change the pickle code for py2, so that py3-compatible pickles are saved. In this case it would just require people to load and resave their pickles with the latest numpy version.
Using the Python array module to store data might be the way to go (rather than strings), since that is available in both py2 and py3.
The pickling/unpickling situation should be marked as a todo, and documented anyway, as we should start a numpy-specific 'porting your code to py3k' document.
A set of pickles saved from python2 would be useful for testing. Forwards compatibility is also a useful thing to test, that is, py3.1 pickles saved to be loaded with python2 numpy.
cheers!
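The Unpickler extension point mentioned above could look roughly like the following on Python 3. This is only a minimal sketch: the class `Point`, the module name `oldmod`, and the `RENAMED` table are hypothetical stand-ins, and nothing here reflects what Numpy actually does. It overrides `pickle.Unpickler.find_class` so that a global recorded under an old name resolves to a current class.

```python
import io
import pickle


class Point:
    """Hypothetical class that used to live in a module called 'oldmod'."""


class RemappingUnpickler(pickle.Unpickler):
    # Hypothetical table mapping old (module, name) pairs to current objects.
    RENAMED = {("oldmod", "Point"): Point}

    def find_class(self, module, name):
        # Resolve remapped globals first, otherwise defer to the default lookup.
        try:
            return self.RENAMED[(module, name)]
        except KeyError:
            return super().find_class(module, name)


def loads(data):
    """Unpickle a bytes object using the remapping unpickler."""
    return RemappingUnpickler(io.BytesIO(data)).load()


# Hand-written protocol-0 pickle that refers to the global 'oldmod.Point'.
OLD_PICKLE = b"coldmod\nPoint\n."
```

Calling `loads(OLD_PICKLE)` returns the current `Point` class even though no module named `oldmod` exists; pickles that reference no remapped globals load as usual.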
Thu, 2009-12-03 at 13:04 +0100, René Dudfield wrote: [clip]
In other news, we cannot support Py2 pickles in Py3 -- this is because Py2 str is unpickled as Py3 str, resulting in encoding failures even before the data is passed on to Numpy.
Is this just for the type codes? Or is there other string data that needs to be pickle loaded? If it is just for the type codes, they are all within the ansi character set and unpickle fine without errors. I'm guessing numpy uses strings to pickle arrays?
The array data is put in a string in __reduce__. The dtype is IIRC mostly stored using integers, though endianness is stored with a character. Actually, now that I look more closely, Py3 pickle.load takes an 'encoding' argument, which will perhaps help here. We should probably just instruct users to pass 'latin1' there in Py3 if they want backwards compatibility. The Numpy __reduce__ and __setstate__ C code must then just be checked for compatibility. [clip]
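The effect of the 'encoding' argument can be sketched on Python 3 as below. The pickle bytes are a hand-written protocol-0 pickle of the Py2 byte string '\xc3\xb6', used here as an illustrative stand-in for binary data stored in a Py2 str; the helper names are made up for this example. Because latin1 maps every byte to the code point of the same value, re-encoding the loaded text with latin1 recovers the original bytes.

```python
import pickle

# Hand-written protocol-0 pickle of the Py2 str '\xc3\xb6' (illustrative data).
PY2_PICKLE = b"S'\\xc3\\xb6'\np0\n."


def load_py2_bytes(data):
    """Load a pickled Py2 str on Py3 and recover the original bytes."""
    text = pickle.loads(data, encoding="latin1")
    return text.encode("latin1")


def default_load_fails(data):
    """Return True if loading with the default ASCII encoding fails."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False
```

With the default encoding the load raises UnicodeDecodeError, while `load_py2_bytes(PY2_PICKLE)` gives back the two original bytes unchanged.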
Using the python array module to store data might be the way to go (rather than strings), since that is available in both py2 and py3.
The array module has the same problem as Numpy, so using it will not help:
    $ python
    Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
    >>> import array
    >>> c = array.array('b', '123öä')
    >>> c
    array('b', [49, 50, 51, -61, -74, -61, -92])
    >>> f = open('foo.pck', 'w'); pickle.dump(c, f); f.close()
    $ python3
    Python 3.0.1+ (r301:69556, Apr 15 2009, 15:59:22)
    >>> import pickle
    >>> f = open('foo.pck', 'rb')
    >>> pickle.load(f)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.0/pickle.py", line 1335, in load
        return Unpickler(file, encoding=encoding, errors=errors).load()
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
The 'encoding' argument does not actually help the array module, but that may be just because of some incompatible __setstate__ stuff in 'array'. [clip]
A set of pickles saved from python2 would be useful for testing. Forwards compatibility is also a useful thing to test. That is py3.1 pickles saved to be loaded with python2 numpy.
In Py3 it would be very convenient to __getstate__ the array data in Bytes (e.g. space savings!), which will be forward incompatible, unless the Py2 side has a custom unpickler. -- Pauli Virtanen
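The forward-incompatibility and the space saving can be seen in the opcodes Python 3 emits for a bytes object. The sketch below is plain standard-library pickle, not Numpy's __reduce__ code, and the payload is arbitrary illustrative data. Protocol 3 writes bytes with a dedicated opcode that Python 2's pickle does not know, while protocol 2 falls back to a longer reduce-based encoding.

```python
import pickle
import pickletools


def opcode_names(data):
    """List the opcode names used in a pickle byte string."""
    return [op.name for op, arg, pos in pickletools.genops(data)]


payload = bytes(range(8))                    # arbitrary illustrative data
proto3 = pickle.dumps(payload, protocol=3)   # uses the Py3-only bytes opcode
proto2 = pickle.dumps(payload, protocol=2)   # Py2-readable, but longer
```

Comparing `opcode_names(proto3)` with `opcode_names(proto2)` shows the dedicated bytes opcode only in the protocol 3 output, and `len(proto3)` is smaller than `len(proto2)`.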
participants (6)
- Charles R Harris
- David Cournapeau
- David Cournapeau
- Pauli Virtanen
- Pauli Virtanen
- René Dudfield