pickle numpy array from pypy to cpython?
I'm trying to construct some data that includes numpy arrays in pypy, pickle it, then unpickle it in cpython (to use some non-pypy-compatible libs). However, the actual class of the pickled array is _numpypy.multiarray, which cpython doesn't have.

Any suggestions?

Thanks,
Eli
On 24 June 2016 at 12:14, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
> I'm trying to construct some data that includes numpy arrays in pypy, pickle it, then unpickle it in cpython (to use some non-pypy-compatible libs).
> However, the actual class of the pickled array is _numpypy.multiarray, which cpython doesn't have.
> Any suggestions?
Have you considered the tofile method and fromfile function?

http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.htm...
http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#nump...

--
William Leslie

Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement.
No, since it's not *just* a numpy array I need to move around (a dict with numpy values in this case; more complicated objects in the future). Obviously I can kludge something manual together (assuming the tofile/fromfile functions work cross-interpreter, which I wouldn't take for granted at this point), but I'd rather be able to use pickle (it's easier to work with libraries that also expect pickles, etc.).

Eli

On Thu, Jun 23, 2016 at 7:54 PM, William ML Leslie <william.leslie.ttg@gmail.com> wrote:
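The "manual kludge" Eli alludes to would amount to walking the structure and replacing each array with plain builtins before pickling, then reversing it after unpickling. This is only an illustrative sketch under that assumption; the pack/unpack helper names are made up, and it only handles dicts of arrays:

```python
import numpy as np

def pack(obj):
    # Replace each ndarray with a portable (tag, dtype, shape, bytes) tuple
    # built only from plain Python types, so the pickle never references an
    # interpreter-specific array class.
    if isinstance(obj, np.ndarray):
        return ("ndarray", str(obj.dtype), obj.shape, obj.tobytes())
    if isinstance(obj, dict):
        return {k: pack(v) for k, v in obj.items()}
    return obj

def unpack(obj):
    # Reverse of pack(): rebuild ndarrays from the tagged tuples.
    if isinstance(obj, tuple) and obj and obj[0] == "ndarray":
        _, dtype, shape, data = obj
        return np.frombuffer(data, dtype=dtype).reshape(shape)
    if isinstance(obj, dict):
        return {k: unpack(v) for k, v in obj.items()}
    return obj
```

A dict of arrays then round-trips through an ordinary pickle on either interpreter, at the cost of having to special-case every container type by hand.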
Last time I tried tofile and fromfile in Numpypy, it was not implemented.

On Fri, Jun 24, 2016 at 7:01 AM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
Yeah, looks like that's still the case:
>>>> z = np.zeros((2,3), dtype=np.float32)
>>>> z.tofile
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'tofile'
What would it take to get cross-interpreter numpy array pickles working?

Thanks,
Eli

On Thu, Jun 23, 2016 at 10:14 PM, David Brochart <david.brochart@gmail.com> wrote:
The first step would be to pickle the same dtype/shape/data ndarray once from numpy and again from _numpypy, and to compare the binary results. The only difference should be the class name; if the difference goes deeper, that difference must be fixed. Then it is just a matter of patching pickle.py to use the desired class instead of the class name encoded into the pickled binary result.

Matti
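Matti's first step (diffing the two pickle streams) can be mechanized with pickletools. The sketch below compares two streams of an ordinary object, since nothing PyPy-specific is needed to show the shape of the comparison; the helper name is made up:

```python
import io
import pickle
import pickletools

def dis_lines(payload):
    # Disassemble an (optimized) pickle stream into a list of opcode lines.
    out = io.StringIO()
    pickletools.dis(pickletools.optimize(payload), out=out)
    return out.getvalue().splitlines()

a = pickle.dumps({"x": [1.0, 2.0, 3.0]})
b = pickle.dumps({"x": [1.0, 2.0, 3.0]})

# For streams produced by two different interpreters, any pair that differs
# (e.g. a GLOBAL opcode naming a different module) pinpoints the gap.
diff = [pair for pair in zip(dis_lines(a), dis_lines(b)) if pair[0] != pair[1]]
```

Here the two streams come from the same interpreter, so `diff` is empty; run against a CPython pickle and a PyPy pickle of the same ndarray, it would isolate exactly the opcodes that disagree.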
Doesn't look like they are exactly the same:

https://gist.github.com/elistevens/03e22f4684fb77d3edfe13ffcd406ef4

Certainly some similarities, though. I'm not familiar with the pickle format, and I haven't yet had time to dig in beyond this. Hoping I can tonight.

Cheers,
Eli

On Fri, Jun 24, 2016 at 1:21 PM, matti picus <matti.picus@gmail.com> wrote:
Okay, if I pass the pickles through pickletools.optimize, they look identical except for the very first line (and a resulting systematic shift in offset):
>>>> pt.dis(pt.optimize(open('cp123.pkl').read()))
    0: c    GLOBAL     'numpy.core.multiarray _reconstruct'

>>>> pt.dis(pt.optimize(open('pp123.pkl').read()))
    0: c    GLOBAL     '_numpypy.multiarray _reconstruct'
So I suspect that simply lying about what class we just pickled would do the trick. I have no idea how acceptable that would be as a general solution, though. Thoughts?

Eli

On Fri, Jun 24, 2016 at 2:29 PM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
Heh, interestingly, if I add the following to the local dir and files when trying to unpickle under cpython, it works (note that cpython to pypy actually works out of the box, which I hadn't realized):

$ cat _numpypy/__init__.py
from numpy.core import *

$ cat _numpypy/multiarray.py
from numpy.core.multiarray import *
import numpy.core.multiarray as _ncm
_reconstruct = _ncm._reconstruct

This is obviously a total hack, and not one I'm comfortable with (since I need to use this codebase from both cpython and pypy), but it demonstrates that it's just bookkeeping that needs to change to get things to work.

My first approach would be to add a wrapper around save_global here:

https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814

that special-cases the global '_numpypy.multiarray' to instead be 'numpy.core.multiarray'. Does that seem like a reasonable thing to do?

Cheers,
Eli

On Fri, Jun 24, 2016 at 5:46 PM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
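For reference, the same renaming can also be applied purely on the loading side, without patching either interpreter, via the standard Unpickler.find_class hook. This is just a sketch of the idea; the alias table is an assumption based on the module names seen in this thread:

```python
import io
import pickle

# Assumed mapping from PyPy's internal module name to the CPython location
# of the equivalent globals.
MODULE_ALIASES = {"_numpypy.multiarray": "numpy.core.multiarray"}

class RenamingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect lookups for aliased modules before resolving the global.
        module = MODULE_ALIASES.get(module, module)
        return super().find_class(module, name)

def loads_compat(payload):
    # Drop-in replacement for pickle.loads that tolerates the aliased names.
    return RenamingUnpickler(io.BytesIO(payload)).load()
```

The trade-off versus fixing save_global is that every consumer has to use the custom unpickler, whereas fixing the writer makes the pickles correct for everyone.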
Sounds reasonable. You might want to generalize it a bit by trying to import _numpypy/numpy, and setting up the replacement according to whichever fails to import.

Matti

On Saturday, 25 June 2016, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
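Matti's generalization might look roughly like this at module-import time. The variable names are invented for illustration, and the empty load-side alias under PyPy reflects Eli's observation that cpython-to-pypy already works out of the box:

```python
try:
    import _numpypy  # only importable under PyPy
    ON_PYPY = True
except ImportError:
    ON_PYPY = False

# When pickling under PyPy, emit CPython's module name; when loading
# under CPython, redirect PyPy's name back to the local implementation.
if ON_PYPY:
    SAVE_ALIAS = {"_numpypy.multiarray": "numpy.core.multiarray"}
    LOAD_ALIAS = {}
else:
    SAVE_ALIAS = {}
    LOAD_ALIAS = {"_numpypy.multiarray": "numpy.core.multiarray"}
```

Either side of the pickle then consults the appropriate table, and the dispatch costs nothing on the interpreter that doesn't need it.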
I was thinking about doing it on import of the micronumpy module (pypy/module/micronumpy/app_numpy.py). Right now, when I try and import pickle during the tests:

$ cat pypy/module/micronumpy/test/test_pickling_app.py
import sys
import py

from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest
from pypy.conftest import option

class AppTestPicklingNumpy(BaseNumpyAppTest):
    def setup_class(cls):
        if option.runappdirect and '__pypy__' not in sys.builtin_module_names:
            py.test.skip("pypy only test")
        BaseNumpyAppTest.setup_class.im_func(cls)

    def test_pickle_module(self):
        import pickle
        ...  # more code

I get this error:

    import struct
lib-python/2.7/pickle.py:34:
_ _ _ _ _ _
    from _struct import *
E   (application-level) ImportError: No module named _struct
lib-python/2.7/struct.py:1: ImportError

But everything seems fine with struct:

$ ./pytest.py pypy/module/struct/test/test_struct.py
==== test session starts ====
platform linux2 -- Python 2.7.11 -- py-1.4.20 -- pytest-2.5.2
pytest-2.5.2 from /home/elis/edit/play/pypy/pytest.pyc
collected 30 items

pypy/module/struct/test/test_struct.py ..............................

==== 30 passed in 11.95 seconds ====

Any idea what's going on here?

Thanks,
Eli

On Fri, Jun 24, 2016 at 9:19 PM, matti picus <matti.picus@gmail.com> wrote:
You need to add the modules to those that the class-local space is built with using a spaceconfig, so something like:

class AppTestPicklingNumpy(BaseNumpyAppTest):
    spaceconfig = dict(usemodules=["micronumpy", "struct", "binascii"])

    def setup_class(cls):
        if option.runappdirect and '__pypy__' not in sys.builtin_module_names:
            py.test.skip("pypy only test")
        BaseNumpyAppTest.setup_class.im_func(cls)

    def test_pickle_module(self):
        import pickle

On 25/06/16 09:01, Eli Stevens (Gmail) wrote:
That did the trick. Pull request here:

https://bitbucket.org/pypy/pypy/pull-requests/460/changes-reported-location-...

Please let me know if there are changes that should be made. As noted, I'm not super happy with the tests, but am unsure what direction I should go with them.

Cheers,
Eli

On Sat, Jun 25, 2016 at 10:26 AM, Matti Picus <matti.picus@gmail.com> wrote:
Any thoughts on whether this approach is acceptable? Happy to incorporate feedback. I wouldn't be surprised if there are more functions than just _reconstruct that will need to be special-cased, but without a concrete use case I wasn't going to complicate things.

Thanks,
Eli

On Sat, Jun 25, 2016 at 8:19 PM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
I think I would prefer this be done in upstream numpy (which is 95% supported by PyPy's cpyext layer) rather than changing the class name when saving a _numpypy ndarray. In both cases, a warning should be emitted when loading the "wrong" object, to tell the user that subtle problems may occur, for instance with complicated record dtypes or with arrays of objects.

Your pull request seems OK, but it needs tests of more complicated numpy types like scalars and record arrays. Again, I would be happier if it spat out some kind of warning when overriding the object name. Maybe we should merge it until we can fix upstream numpy? Does anyone else have an opinion?

Matti

On 29/06/16 19:59, Eli Stevens (Gmail) wrote:
To make sure I'm understanding, are you saying that upstream/cpython numpy should pick up an alternate way to import multiarray (via _numpypy.multiarray, instead of numpy.core.multiarray), similar to how one can `import numpy` under pypy, even though the real implementation is in `_numpypy`?

Thanks,
Eli

On Wed, Jun 29, 2016 at 11:31 AM, Matti Picus <matti.picus@gmail.com> wrote:
Hi Eli, hi Matti,

On 29 June 2016 at 21:37, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
> To make sure I'm understanding, are you saying that upstream/cpython numpy should pick up an alternate way to import multiarray (via _numpypy.multiarray, instead of numpy.core.multiarray)
Hum, in my opinion we should always pickle/unpickle arrays by reproducing and expecting the exact same format as CPython's numpy, with no warnings. Any difference (e.g. with complicated dtypes) is a bug that should eventually be fixed.

A bientôt,

Armin.
FWVLIW, I think that conforming to upstream numpy makes the most sense. I think that the issue would go away if the `_numpypy` module were renamed to `numpy` (and appropriate things moved into `numpy.core`). Is there a technical reason to keep the actual implementation in a separately named module?

Thinking larger picture, would it be possible and sensible to switch to using the slow cpyext numpy approach for compatibility, then overlay custom implementations of things on top of that when speed is needed? I'm imagining a vague inverse of the cpython approach, where modules are implemented in C when the python performance isn't acceptable.

Eli

On Wed, Jun 29, 2016 at 10:58 PM, Armin Rigo <arigo@tunes.org> wrote:
Hi,

I'd like to use the (numerical) performance of PyPy as an equivalent to Numba's @jit decorator (https://github.com/davidbrochart/piopio). The only thing preventing that right now is the passing around (pickling) of Numpy arrays, so it would be great to have that compatibility.

David.

On Mon, Jul 11, 2016 at 6:43 PM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
The issue with '_numpypy.multiarray' appearing in the pickle string rather than 'numpy.core.multiarray' should be fixed on the numpypy_pickle_compat branch (thanks to Eli). A Linux 64 build is available at http://buildbot.pypy.org/nightly/numpypy_pickle_compat/pypy-c-jit-85727-6d90....

Eli or David or anyone who uses numpy pickles, could you check that this works as advertised? I am concerned about how compatible our pickling is with upstream numpy, but I do not really use that feature of numpy, so another pair of eyes would be nice before merging to default.

Note this requires that http://bitbucket.org/pypy/numpy be installed, since the Unpickler must be able to import numpy.core.multiarray.

Matti

On 15/07/16 10:47, David Brochart wrote:
Hi,
I'd like to use the (numerical) performances of PyPy as an equivalent to Numba's @jit decorator (https://github.com/davidbrochart/piopio). The only thing preventing that right now is the passing around (pickling) of Numpy arrays, so it would be great to have that compatibility.
David.
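Matti's request — verifying that a pickle produced on one side loads on the other — comes down to a small round-trip script. In practice the dump would run under the branch build's PyPy and the load under CPython; here both halves run in one process just to show the shape of the check (the filename is only for illustration):

```python
import pickle
import numpy as np

# Step 1 (under PyPy): pickle a structure mixing numpy values and plain objects.
payload = {"weights": np.linspace(0.0, 1.0, 5), "label": "demo"}
with open("payload.pkl", "wb") as f:
    pickle.dump(payload, f, protocol=2)  # protocol 2 is understood by both sides

# Step 2 (under CPython): unpickle and verify the contents survived intact.
with open("payload.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored["label"] == "demo"
assert (restored["weights"] == payload["weights"]).all()
```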
Hi,

I verified that this version of PyPy can load a Numpy array that was pickled by CPython (and do stuff with it), but it looks like a Numpy array pickled by PyPy cannot be loaded by CPython, because PyPy still uses '_numpypy.multiarray' in the pickle string when dumping:

ImportError: No module named _numpypy.multiarray

David.

On Sat, Jul 16, 2016 at 12:07 PM, Matti Picus <matti.picus@gmail.com> wrote:
The issue with '_numpypy.multiarray' in the pickle string rather than 'numpy.core.multiarray' should be fixed on the numpypy_pickle_compat branch (thanks to Eli) A linux 64 build is available http://buildbot.pypy.org/nightly/numpypy_pickle_compat/pypy-c-jit-85727-6d90... . Eli or David or anyone who uses numpy pickle, could you check that this works as advertised? I am concerned about how compatible our pickling is with upstream numpy, but do not really use that feature of numpy so another pair of eyes would be nice before merging to default.
Note this requires that http://bitbucket.org/pypy/numpy be installed since the Unpickler must be able to import numpy.core.multiarray Matti
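The quickest way to see why the other interpreter raises an ImportError like the one David reports is to disassemble the stream: `pickletools.dis` shows the module path the Unpickler will try to import.

```python
import io
import pickle
import pickletools
import numpy as np

blob = pickle.dumps(np.array([1.0, 2.0]))

# The GLOBAL/STACK_GLOBAL opcode carries the 'module.attribute' the Unpickler
# imports; on CPython's numpy this names a multiarray module, while a stream
# built by an unfixed PyPy would show '_numpypy.multiarray' here instead.
out = io.StringIO()
pickletools.dis(blob, out=out)
listing = out.getvalue()
assert 'multiarray' in listing
```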
To be more precise, PyPy pickling of Numpy arrays works fine; it is when PyPy pickles a Numpy scalar that I get the error.

David.

On Sat, Jul 16, 2016 at 2:04 PM, David Brochart <david.brochart@gmail.com> wrote:
Hi,
I verified that this version of PyPy can load a Numpy array that was pickled by CPython (and do stuff with it), but it looks like a Numpy array pickled by PyPy cannot be loaded by CPython, because PyPy still uses '_numpypy.multiarray' in the pickle string for dumping: ImportError: No module named _numpypy.multiarray
David.
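David's distinction matters because arrays and scalars are rebuilt by different functions named in the pickle stream, so fixing the array path does not automatically fix the scalar path. A sketch under CPython's numpy, where both round-trip:

```python
import pickle
import numpy as np

arr = np.array([1.0, 2.0])
scalar = np.float64(3.0)

# An ndarray pickles via a _reconstruct-style helper, while a numpy scalar
# pickles via a separate scalar-rebuilding function, so the two streams name
# different reconstructors even though both live in numpy's multiarray module.
arr_blob = pickle.dumps(arr)
scalar_blob = pickle.dumps(scalar)

assert (pickle.loads(arr_blob) == arr).all()
assert pickle.loads(scalar_blob) == scalar
assert isinstance(pickle.loads(scalar_blob), np.float64)
```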
I am not surprised that my current branch doesn't cover all cases; it was specifically targeted at my exact, singular use case. I'll work on making something more general, as well as improving test coverage.

On Sat, Jul 16, 2016 at 9:29 AM, Matti Picus <matti.picus@gmail.com> wrote:
So it seems the tests are lacking. Someone should:
- go through all the existing calls to dumps in tests and add "assert '_numpypy' not in data"
- add tests for scalars
- fix so the tests pass
Matti
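Matti's checklist could be sketched as a single helper applied to both arrays and scalars (the name `assert_cpython_loadable` is hypothetical, just for illustration):

```python
import pickle
import numpy as np

def assert_cpython_loadable(obj):
    """Matti's suggested check: the stream must not name PyPy's private
    _numpypy module, or CPython's Unpickler cannot import the reconstructor."""
    data = pickle.dumps(obj)
    assert b'_numpypy' not in data, "pickle references a PyPy-only module"
    return data

# Cover arrays *and* scalars, since they pickle through different paths.
for obj in (np.arange(3), np.float64(1.5), np.int32(7)):
    assert_cpython_loadable(obj)
```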
participants (6)
- Armin Rigo
- David Brochart
- Eli Stevens (Gmail)
- matti picus
- Matti Picus
- William ML Leslie