[pypy-dev] PyArrow on PyPy: PyDateTime_CAPI question (getting cpyext to work) in c++

Niklas B niklas.bivald at enplore.com
Wed Sep 9 08:08:15 EDT 2020


Thank you, I managed to get it built using an older version. Instructions and a whl file, for those who are interested, are available at https://github.com/bivald/pyarrow-on-pypy3

I have set up a monthly 50 USD recurring donation (under “Enplore”).
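
For anyone who hits the same "module 'datetime' has no attribute 'datetime_CAPI'" error quoted below, here is a small check that can be run on any interpreter. It is only a sketch: the function name `datetime_capsule_info` is my own, and it just reports whether the running interpreter exposes the capsule as a plain module attribute, which is what `PyCapsule_Import("datetime.datetime_CAPI", 0)` relies on. It does not touch cpyext itself.

```python
import datetime
import platform


def datetime_capsule_info():
    """Report whether this interpreter exposes datetime.datetime_CAPI.

    PyCapsule_Import("datetime.datetime_CAPI", 0) imports the module
    "datetime" and then looks up the attribute "datetime_CAPI" on it,
    so a missing attribute makes that call fail.
    """
    capsule = getattr(datetime, "datetime_CAPI", None)
    return {
        "implementation": platform.python_implementation(),
        "has_datetime_CAPI": capsule is not None,
        "capsule_type": type(capsule).__name__ if capsule is not None else None,
    }


if __name__ == "__main__":
    print(datetime_capsule_info())
```

If `has_datetime_CAPI` comes back False, importing the capsule by name cannot work on that interpreter, and the `PyDateTime_IMPORT` macro from `datetime.h` is the route to take instead.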

> On 9 Sep 2020, at 00:06, Matti Picus <matti.picus at gmail.com> wrote:
> 
> Have you tried the documented interface https://docs.python.org/3.6/c-api/datetime.html
> 
> which is to dispense with all that code and use
> 
> 
> #include <datetime.h>
> 
> PyDateTime_IMPORT;
> 
> 
> which does the right thing on each implementation (CPython - does the PyCapsule_Import; PyPy - calls _PyDateTime_Import() )
> 
> 
> A donation to https://opencollective.com/pypy is always appreciated.
> 
> Matti
> 
> 
> On 9/9/20 12:12 AM, Niklas B wrote:
>> Hi,
>> 
>> I’ve been trying to build the data science library pyarrow (the Arrow library, mainly for Parquet files in my case) for PyPy. I got it working for PyPy2 a few years ago and am now trying PyPy3. Overall I can build it and produce a pyarrow wheel file by following the Arrow instructions. So far so good. I expect a large part of pyarrow not to work, but for my case I really only need `pandas.read_parquet`. However, I am stuck trying to figure out how to use the PyPy cpyext for datetime.
>> 
>> The code I’m trying to build is:
>> 
>> https://github.com/apache/arrow/blob/maint-0.15.x/cpp/src/arrow/python/datetime.cc#L37 (older branch, better luck in building it)
>> 
>> Which is basically:
>> 
>> PyDateTime_CAPI* datetime_api = nullptr;
>> 
>> void InitDatetime() {
>>   PyAcquireGIL lock;
>>   datetime_api =
>>       reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import(PyDateTime_CAPSULE_NAME, 0));
>>   if (datetime_api == nullptr) {
>>     Py_FatalError("Could not import datetime C API");
>>   }
>> }
>> 
>> 
>> I’ve tried about a million different ways, but I’m way outside my comfort zone :) I can get it to build by doing:
>> 
>> datetime_api = PyDateTimeAPI;
>> 
>> And also:
>> 
>> datetime_api = reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import("datetime", 0));
>> 
>> And:
>> 
>> datetime_api = reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import("datetime.datetime_CAPI", 0));
>> 
>> But all of these trigger an error afterwards: "Could not import datetime C API", "PyCapsule_Import "datetime" is not valid", or "module 'datetime' has no attribute 'datetime_CAPI'".
>> 
>> I will be posting reproducible builds once I get them working.
>> 
>> I am more than happy to pay 300 USD to anyone (or to PyPy) who can help me get this to run:
>> 
>> import pandas
>> d = pandas.read_parquet('file.parq')
>> 
>> Obviously that’s not enough money to cover the work, but at least it’s something :) All results and builds will be public.
>> 
>> Regards,
>> Niklas
>> 
>> _______________________________________________
>> pypy-dev mailing list
>> pypy-dev at python.org
>> https://mail.python.org/mailman/listinfo/pypy-dev
