Yep, I can confirm your patch builds and works, for parquet at the very least (https://github.com/bivald/pyarrow-on-pypy3/tree/feature/latest-pypy-latest-pyarrow).

I would say it works surprisingly well for parquet:

======================================================================================= test session starts ========================================================================================
platform linux -- Python 3.6.9[pypy-7.3.3-alpha], pytest-6.0.2, py-1.9.0, pluggy-0.13.1
rootdir: /arrow/python, configfile: setup.cfg
plugins: hypothesis-5.35.1
collected 286 items

pyarrow2/tests/test_parquet.py .s.s.s.s..s.s.s.s.ss.s.s.s.s.s.s..s.s.s.s.s.s.s.s.s.s..s.s..s.s.................................s.s...........s..s.s.sxs.s.s.sxs.sxsss....s.s.s.s....s.s.s.s. [ 54%]
s.sxsxs.s...s.s.s.s.s.s..s.ssssssss.s.s.s.s.s.s.s.sx.s.s.s....s..s.s..s.s.s..s.s.s.s..s.s...ss..s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.sss                                                            [100%]

=========================================================================== 164 passed, 116 skipped, 6 xfailed in 43.69s ===========================================================================

Overall, for the entire test suite:

====================================================================================================== short test summary info ======================================================================================================
FAILED pyarrow2/tests/test_array.py::test_to_pandas_zero_copy - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_array.py::test_array_slice - SystemError: Function returned an error result without setting an exception
FAILED pyarrow2/tests/test_array.py::test_array_ref_to_ndarray_base - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_array.py::test_array_conversions_no_sentinel_values - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_array.py::test_nbytes_sizeof - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_cffi.py::test_export_import_array - assert 1528 == 896
FAILED pyarrow2/tests/test_cffi.py::test_export_import_batch - assert 1048 == 128
FAILED pyarrow2/tests/test_convert_builtin.py::test_garbage_collection - assert 128 == 766912
FAILED pyarrow2/tests/test_convert_builtin.py::test_sequence_bytes - NotImplementedError: creating contiguous readonly buffer from non-contiguous not implemented yet
FAILED pyarrow2/tests/test_convert_builtin.py::test_map_from_dicts - AssertionError: Regex pattern 'integer is required' does not match 'expected integer, got str object'.
FAILED pyarrow2/tests/test_csv.py::test_read_options - Failed: DID NOT RAISE <class 'AttributeError'>
FAILED pyarrow2/tests/test_csv.py::test_parse_options - Failed: DID NOT RAISE <class 'AttributeError'>
FAILED pyarrow2/tests/test_csv.py::test_convert_options - Failed: DID NOT RAISE <class 'AttributeError'>
FAILED pyarrow2/tests/test_csv.py::TestSerialStreamingCSVRead::test_batch_lifetime - AssertionError: assert 1464704 == 1464576
FAILED pyarrow2/tests/test_cython.py::test_cython_api - subprocess.CalledProcessError: Command '['/pyarrow/bin/pypy3', 'setup.py', 'build_ext', '--inplace']' returned non-zero exit status 1.
FAILED pyarrow2/tests/test_extension_type.py::test_ext_type__lifetime - AssertionError: assert UuidType(extension<arrow.py_extension_type>) is None
FAILED pyarrow2/tests/test_extension_type.py::test_uuid_type_pickle - AssertionError: assert UuidType(extension<arrow.py_extension_type>) is None
FAILED pyarrow2/tests/test_extension_type.py::test_ext_array_lifetime - AssertionError: assert ParamExtType(extension<arrow.py_extension_type>) is None
FAILED pyarrow2/tests/test_fs.py::test_py_filesystem_lifetime - AssertionError: assert <pyarrow2.tests.test_fs.DummyHandler object at 0x0000000003e696a8> is None
FAILED pyarrow2/tests/test_pandas.py::test_to_pandas_deduplicate_integers_as_objects - assert 100 == 991
FAILED pyarrow2/tests/test_pandas.py::test_array_uses_memory_pool - assert 103552 == 465152
FAILED pyarrow2/tests/test_pandas.py::test_to_pandas_self_destruct - assert 6112064 == 4112064
FAILED pyarrow2/tests/test_pandas.py::test_table_uses_memory_pool - assert 6249408 == 6112064
FAILED pyarrow2/tests/test_pandas.py::test_object_leak_in_numpy_array - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_pandas.py::test_object_leak_in_dataframe - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_schema.py::test_schema_sizeof - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_sparse_tensor.py::test_sparse_coo_tensor_base_object - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_sparse_tensor.py::test_sparse_csr_matrix_base_object - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_sparse_tensor.py::test_sparse_csf_tensor_base_object - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_table.py::test_chunked_array_basics - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_table.py::test_recordbatch_basics - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_table.py::test_table_basics - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_tensor.py::test_tensor_base_object - AttributeError: module 'sys' has no attribute 'getrefcount'
========================================================================= 33 failed, 2620 passed, 532 skipped, 13 xfailed, 10 warnings in 104.02s (0:01:44) =========================================================================
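
To see at a glance which causes dominate a summary like the one above, the FAILED lines can be grouped by their error message. A rough sketch, assuming the "FAILED <test id> - <reason>" layout shown above; tally_failures and sample are names made up for illustration, and the sample only holds four of the lines from the log:

```python
from collections import Counter


def tally_failures(summary_text):
    """Count FAILED lines in a pytest short summary, grouped by error reason."""
    counts = Counter()
    for line in summary_text.splitlines():
        if not line.startswith("FAILED "):
            continue
        # Layout assumed: "FAILED <test id> - <reason>"; split on the first " - ".
        _, _, reason = line.partition(" - ")
        counts[reason.strip()] += 1
    return counts


# Four lines copied from the summary above, for illustration only.
sample = """\
FAILED pyarrow2/tests/test_array.py::test_to_pandas_zero_copy - AttributeError: module 'sys' has no attribute 'getrefcount'
FAILED pyarrow2/tests/test_array.py::test_nbytes_sizeof - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_schema.py::test_schema_sizeof - TypeError: getsizeof(...)
FAILED pyarrow2/tests/test_tensor.py::test_tensor_base_object - AttributeError: module 'sys' has no attribute 'getrefcount'
"""

for reason, n in tally_failures(sample).most_common():
    print(n, reason)
```

Feeding the whole summary into the same function would show how many of the failures come down to the sys.getrefcount and getsizeof errors.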

It segfaults on a couple of the tests, but a lot of them pass as well.
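
Several of the failures in the summary are AttributeErrors for sys.getrefcount, which the log shows is missing on this PyPy build, so those tests probably check CPython reference counting rather than pyarrow behaviour. A minimal sketch of how such a test could be skipped when the function is missing, using only the standard library; RefcountExample and HAS_REFCOUNT are hypothetical names and nothing here is taken from pyarrow's test suite:

```python
import sys
import unittest

# sys.getrefcount is a CPython detail; the log above shows it missing on PyPy.
HAS_REFCOUNT = hasattr(sys, "getrefcount")


class RefcountExample(unittest.TestCase):
    """Hypothetical test that only makes sense with reference counting."""

    @unittest.skipUnless(HAS_REFCOUNT, "sys.getrefcount is unavailable (e.g. on PyPy)")
    def test_extra_reference_is_counted(self):
        obj = object()
        before = sys.getrefcount(obj)
        alias = obj  # binding a second name adds one reference
        self.assertEqual(sys.getrefcount(obj), before + 1)
        del alias
```

Run under pytest or "python -m unittest", this would report the test as skipped on an interpreter without sys.getrefcount instead of failing.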

On 11 Sep 2020, at 14:18, Matti Picus <matti.picus@gmail.com> wrote:

These changes seem to compile for me, but I did not run the tests.

https://gist.github.com/mattip/c9c8398b58721ae5893dc8134c353f28

Matti


On 9/11/20 1:01 PM, Niklas B wrote:
The PyMemoryView_GetContiguous errors are all gone - good work!

It didn’t really like my butchering of datetime.cc: https://github.com/apache/arrow/blob/apache-arrow-1.0.1/cpp/src/arrow/python/datetime.cc#L37

Added:
#include <datetime.h>

And
PyDateTime_IMPORT



Then changed:
  datetime_api =
      reinterpret_cast<PyDateTime_CAPI*>(PyCapsule_Import(PyDateTime_CAPSULE_NAME, 0));


To:

  datetime_api = PyDateTimeAPI;

They do some PyDateTimeAPI voodoo at https://github.com/apache/arrow/blob/apache-arrow-1.0.1/cpp/src/arrow/python/datetime.h#L29 which might be a culprit as well.

This gives the following output:
/arrow/cpp/src/arrow/python/datetime.h:34: warning: "PyDateTimeAPI" redefined
 #define PyDateTimeAPI ::arrow::py::internal::datetime_api

In file included from /opt/pypy/include/Python.h:144,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/pypy_decl.h:1121: note: this is the location of the previous definition
 #define PyDateTimeAPI PyPyDateTimeAPI

In file included from /opt/pypy/include/datetime.h:7,
                 from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/cpyext_datetime.h:4:5: error: ‘PyTypeObject’ does not name a type; did you mean ‘PyType_Check’?
     PyTypeObject *DateType;
     ^~~~~~~~~~~~
     PyType_Check
/opt/pypy/include/cpyext_datetime.h:5:5: error: ‘PyTypeObject’ does not name a type; did you mean ‘PyType_Check’?
     PyTypeObject *DateTimeType;
     ^~~~~~~~~~~~
     PyType_Check
/opt/pypy/include/cpyext_datetime.h:6:5: error: ‘PyTypeObject’ does not name a type; did you mean ‘PyType_Check’?
     PyTypeObject *TimeType;
     ^~~~~~~~~~~~
     PyType_Check
/opt/pypy/include/cpyext_datetime.h:7:5: error: ‘PyTypeObject’ does not name a type; did you mean ‘PyType_Check’?
     PyTypeObject *DeltaType;
     ^~~~~~~~~~~~
     PyType_Check
/opt/pypy/include/cpyext_datetime.h:8:5: error: ‘PyTypeObject’ does not name a type; did you mean ‘PyType_Check’?
     PyTypeObject *TZInfoType;
     ^~~~~~~~~~~~
     PyType_Check
/opt/pypy/include/cpyext_datetime.h:11:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*Date_FromDate)(int, int, int, PyTypeObject*);
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:12:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*DateTime_FromDateAndTime)(int, int, int, int, int, int, int,
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:14:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*Time_FromTime)(int, int, int, int, PyObject*, PyTypeObject*);
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:15:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*Delta_FromDelta)(int, int, int, int, PyTypeObject*);
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:18:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*DateTime_FromTimestamp)(PyObject*, PyObject*, PyObject*);
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:19:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *(*Date_FromTimestamp)(PyObject*, PyObject*);
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:24:5: error: ‘PyObject_HEAD’ does not name a type
     PyObject_HEAD
     ^~~~~~~~~~~~~
/opt/pypy/include/cpyext_datetime.h:24:5: note: the macro ‘PyObject_HEAD’ had not yet been defined
In file included from /opt/pypy/include/object.h:10,
                 from /opt/pypy/include/Python.h:79,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/cpyext_object.h:5: note: it was later defined here
 #define PyObject_HEAD  \

In file included from /opt/pypy/include/datetime.h:7,
                 from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/cpyext_datetime.h:35:5: error: ‘PyObject_HEAD’ does not name a type
     PyObject_HEAD
     ^~~~~~~~~~~~~
/opt/pypy/include/cpyext_datetime.h:35:5: note: the macro ‘PyObject_HEAD’ had not yet been defined
In file included from /opt/pypy/include/object.h:10,
                 from /opt/pypy/include/Python.h:79,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/cpyext_object.h:5: note: it was later defined here
 #define PyObject_HEAD  \

In file included from /opt/pypy/include/datetime.h:7,
                 from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/cpyext_datetime.h:37:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *tzinfo;
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:42:5: error: ‘PyObject_HEAD’ does not name a type
     PyObject_HEAD
     ^~~~~~~~~~~~~
/opt/pypy/include/cpyext_datetime.h:42:5: note: the macro ‘PyObject_HEAD’ had not yet been defined
In file included from /opt/pypy/include/object.h:10,
                 from /opt/pypy/include/Python.h:79,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/cpyext_object.h:5: note: it was later defined here
 #define PyObject_HEAD  \

In file included from /opt/pypy/include/datetime.h:7,
                 from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/cpyext_datetime.h:44:5: error: ‘PyObject’ does not name a type; did you mean ‘PyObject_New’?
     PyObject *tzinfo;
     ^~~~~~~~
     PyObject_New
/opt/pypy/include/cpyext_datetime.h:49:5: error: ‘PyObject_HEAD’ does not name a type
     PyObject_HEAD
     ^~~~~~~~~~~~~
/opt/pypy/include/cpyext_datetime.h:49:5: note: the macro ‘PyObject_HEAD’ had not yet been defined
In file included from /opt/pypy/include/object.h:10,
                 from /opt/pypy/include/Python.h:79,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/cpyext_object.h:5: note: it was later defined here
 #define PyObject_HEAD  \

In file included from /opt/pypy/include/datetime.h:7,
                 from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/cpyext_datetime.h:54:5: error: ‘PyObject_HEAD’ does not name a type
     PyObject_HEAD
     ^~~~~~~~~~~~~
/opt/pypy/include/cpyext_datetime.h:54:5: note: the macro ‘PyObject_HEAD’ had not yet been defined
In file included from /opt/pypy/include/object.h:10,
                 from /opt/pypy/include/Python.h:79,
                 from /arrow/cpp/src/arrow/python/platform.h:23,
                 from /arrow/cpp/src/arrow/python/pyarrow.h:20,
                 from /arrow/cpp/src/arrow/python/common.h:24,
                 from /arrow/cpp/src/arrow/python/datetime.cc:24:
/opt/pypy/include/cpyext_object.h:5: note: it was later defined here
 #define PyObject_HEAD  \

In file included from /arrow/cpp/src/arrow/python/datetime.cc:22:
/opt/pypy/include/datetime.h:9:30: error: expected constructor, destructor, or type conversion before ‘PyDateTimeAPI’
 PyAPI_DATA(PyDateTime_CAPI*) PyDateTimeAPI;
                              ^~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc:37:1: error: expected ‘)’ before ‘=’ token
 PyDateTime_IMPORT
 ^~~~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc: In function ‘arrow::Status arrow::py::internal::PyTime_from_int(int64_t, arrow::TimeUnit::type, PyObject**)’:
/arrow/cpp/src/arrow/python/datetime.cc:237:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘Time_FromTime’
   *out = PyTime_FromTime(static_cast<int32_t>(hour), static_cast<int32_t>(minute),
          ^~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc:237:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘TimeType’
   *out = PyTime_FromTime(static_cast<int32_t>(hour), static_cast<int32_t>(minute),
          ^~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc: In function ‘arrow::Status arrow::py::internal::PyDate_from_int(int64_t, arrow::DateUnit, PyObject**)’:
/arrow/cpp/src/arrow/python/datetime.cc:245:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘Date_FromDate’
   *out = PyDate_FromDate(static_cast<int32_t>(year), static_cast<int32_t>(month),
          ^~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc:245:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘DateType’
   *out = PyDate_FromDate(static_cast<int32_t>(year), static_cast<int32_t>(month),
          ^~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc: In function ‘arrow::Status arrow::py::internal::PyDateTime_from_int(int64_t, arrow::TimeUnit::type, PyObject**)’:
/arrow/cpp/src/arrow/python/datetime.cc:257:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘DateTime_FromDateAndTime’
   *out = PyDateTime_FromDateAndTime(
          ^~~~~~~~~~~~~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/python/datetime.cc:257:10: error: ‘struct PyDateTime_CAPI’ has no member named ‘DateTimeType’
   *out = PyDateTime_FromDateAndTime(
          ^~~~~~~~~~~~~~~~~~~~~~~~~~
make[2]: *** [src/arrow/python/CMakeFiles/arrow_python_objlib.dir/build.make:121: src/arrow/python/CMakeFiles/arrow_python_objlib.dir/datetime.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1628: src/arrow/python/CMakeFiles/arrow_python_objlib.dir/all] Error



On 10 Sep 2020, at 13:30, Niklas B <niklas.bivald@enplore.com> wrote:

That’s what I figured, cool, I will try it!

On 10 Sep 2020, at 13:28, Matti Picus <matti.picus@gmail.com> wrote: