I will try. I tried with the 2020-09-10 nightly (https://buildbot.pypy.org/nightly/py3.6/pypy-c-jit-latest-linux64.tar.bz2), but your changes might not have made it into that one. The error I get is:

https://github.com/apache/arrow/blob/apache-arrow-1.0.1/cpp/src/arrow/python...

In file included from /arrow/cpp/src/arrow/python/common.cc:18:
/arrow/cpp/src/arrow/python/common.h: In member function ‘arrow::Status arrow::py::PyBytesView::FromBinary(PyObject*, const char*)’:
/arrow/cpp/src/arrow/python/common.h:256:31: error: ‘PyMemoryView_GetContiguous’ was not declared in this scope
    PyObject* contig_view = PyMemoryView_GetContiguous(obj, PyBUF_READ, 'C');
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/common.h:256:31: note: suggested alternative: ‘PyMemoryView_FromMemory’
    PyObject* contig_view = PyMemoryView_GetContiguous(obj, PyBUF_READ, 'C');
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
                            PyMemoryView_FromMemory

Or I might need to patch something :)

The odd thing is that when running it on the latest code I don’t get any errors on datetime, only the one above. That said, it might simply be that the datetime errors are “below” the above errors.

I did a new branch for latest: https://github.com/bivald/pyarrow-on-pypy3/blob/feature/latest-pypy-latest-p...
On 10 Sep 2020, at 11:30, Matti Picus <matti.picus@gmail.com> wrote:
I implemented the easy part of PyMemoryView_GetContiguous (the case that does not require allocating a new contiguous buffer and copying non-contiguous data into it), which will make it into the upcoming release. You can try it out from tonight's nightlies.
Matti
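As an illustration of what the "easy part" might cover: when the exporter's buffer is already laid out in the requested order, nothing has to be allocated or copied, so the work reduces to a layout check. The following is a minimal standalone sketch of such a check for C (row-major) order. The helper name is_c_contiguous and its parameters are hypothetical, chosen only for this example; this is not PyPy's or CPython's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CPython or PyPy): returns true when a
// buffer described by `shape`, byte `strides` and `itemsize` is laid out in
// C (row-major) order, i.e. the last dimension varies fastest with no gaps.
bool is_c_contiguous(const std::vector<std::ptrdiff_t>& shape,
                     const std::vector<std::ptrdiff_t>& strides,
                     std::ptrdiff_t itemsize) {
    if (shape.size() != strides.size()) {
        return false;  // inconsistent description
    }
    // A buffer with no elements is trivially contiguous.
    for (std::ptrdiff_t dim : shape) {
        if (dim == 0) {
            return true;
        }
    }
    // Walk from the innermost dimension outwards, tracking the stride a
    // densely packed buffer would have at each level.
    std::ptrdiff_t expected = itemsize;
    for (std::size_t i = shape.size(); i-- > 0;) {
        // The stride of a length-1 dimension is never used, so ignore it.
        if (shape[i] > 1 && strides[i] != expected) {
            return false;
        }
        expected *= shape[i];
    }
    return true;
}
```

For a 2x3 array of 8-byte items, strides of {24, 8} pass the check, while the column-major strides {8, 16} do not; a buffer like the latter would fall into the "hard part" that needs a copy.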
On 9/9/20 3:08 PM, Niklas B wrote:
Thank you, I managed to get it built using an older version (instructions and a whl file for those who are interested are available at https://github.com/bivald/pyarrow-on-pypy3).
I have set up a monthly 50 USD recurring donation (under “Enplore”).
On 9 Sep 2020, at 00:06, Matti Picus <matti.picus@gmail.com> wrote:
Have you tried the documented interface (https://docs.python.org/3.6/c-api/datetime.html),
which is to dispense with all that code and use

    #include <datetime.h>
    PyDateTime_IMPORT

which does the right thing on each implementation (CPython: does the PyCapsule_Import; PyPy: calls _PyDateTime_Import())?
A donation to https://opencollective.com/pypy is always appreciated.
Matti
On 9/9/20 12:12 AM, Niklas B wrote:
Hi,
I’ve been trying to build the data science library pyarrow (the Arrow library, mainly for Parquet files in my case) for PyPy. I got it working for PyPy2 a few years ago, and am now trying for PyPy3. Overall I get it to build and produce a pyarrow wheel file by following the Arrow instructions. So far so good. I expect a massive part of pyarrow not to work, but for my case I really only need `pandas.read_parquet`. However, I am stuck trying to figure out how to use the PyPy cpyext for datetime.
The code I’m trying to build is:
https://github.com/apache/arrow/blob/maint-0.15.x/cpp/src/arrow/python/datet... (older branch, better luck in building it)
Which is basically:
    PyDateTime_CAPI* datetime_api = nullptr;

    void InitDatetime() {
      PyAcquireGIL lock;
      datetime_api =
          reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import(PyDateTime_CAPSULE_NAME, 0));
      if (datetime_api == nullptr) {
        Py_FatalError("Could not import datetime C API");
      }
    }
I’ve tried about a million different ways, but I’m way outside my comfort zone :) I can get it to build by doing:
datetime_api = PyDateTimeAPI;
And also:
datetime_api = reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import("datetime", 0));
And:
datetime_api = reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import("datetime.datetime_CAPI", 0));
But all of these trigger the fatal error in the code afterwards (“Could not import datetime C API”, “PyCapsule_Import "datetime" is not valid”, or “module 'datetime' has no attribute 'datetime_CAPI'”).
I will be posting reproducible builds once I get them working.
I am more than happy to pay 300 USD to anyone (or to PyPy) who can help me get this to run:

    import pandas
    d = pandas.read_parquet('file.parq')

Obviously that’s not enough money to cover things, but at least it’s something :) All results and builds will of course be public.
Regards, Niklas
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev