[scikit-learn] Strange behavior when I add a member to a cython struct

Jeff Blackburne jblackburne at gmail.com
Sun Oct 2 02:19:53 EDT 2016


Hi,

As part of my work on PR #4899 (categorical splits for tree-based
learners), I want to add a pointer member to the Node struct in
sklearn/tree/_tree.pxd. But when I do this, it causes some of the unit
tests to fail in the 32-bit Appveyor (Windows) CI. (Actually, it usually
causes them to hang indefinitely.) I'm testing this with the latest commit
on master.

The patch I'm applying is listed in full below; it's tiny. If you like, I
can make a new PR to demonstrate the behavior.

Does anyone know why this would happen, and only on 32-bit Windows?

Thanks,
Jeff


```
diff --git a/sklearn/tree/_tree.pxd b/sklearn/tree/_tree.pxd
index dbf0545..b80e7bb 100644
--- a/sklearn/tree/_tree.pxd
+++ b/sklearn/tree/_tree.pxd
@@ -32,6 +32,7 @@ cdef struct Node:
     DOUBLE_t impurity                    # Impurity of the node (i.e., the value of the criterion)
     SIZE_t n_node_samples                # Number of samples at the node
     DOUBLE_t weighted_n_node_samples     # Weighted number of samples at the node
+    UINT32_t *foo


 cdef class Tree:
diff --git a/sklearn/tree/_tree.pyx b/sklearn/tree/_tree.pyx
index 4e8160f..a2f8117 100644
--- a/sklearn/tree/_tree.pyx
+++ b/sklearn/tree/_tree.pyx
@@ -68,9 +68,9 @@ cdef SIZE_t INITIAL_STACK_SIZE = 10
 # Repeat struct definition for numpy
 NODE_DTYPE = np.dtype({
     'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity',
-              'n_node_samples', 'weighted_n_node_samples'],
+              'n_node_samples', 'weighted_n_node_samples', 'foo'],
     'formats': [np.intp, np.intp, np.intp, np.float64, np.float64, np.intp,
-                np.float64],
+                np.float64, np.intp],
     'offsets': [
         <Py_ssize_t> &(<Node*> NULL).left_child,
         <Py_ssize_t> &(<Node*> NULL).right_child,
@@ -78,7 +78,8 @@ NODE_DTYPE = np.dtype({
         <Py_ssize_t> &(<Node*> NULL).threshold,
         <Py_ssize_t> &(<Node*> NULL).impurity,
         <Py_ssize_t> &(<Node*> NULL).n_node_samples,
-        <Py_ssize_t> &(<Node*> NULL).weighted_n_node_samples
+        <Py_ssize_t> &(<Node*> NULL).weighted_n_node_samples,
+        <Py_ssize_t> &(<Node*> NULL).foo
     ]
 })
```
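
One way to investigate, offered only as a sketch and not as a confirmed diagnosis: check whether the compiler adds trailing padding to the patched struct on the failing platform. The `NODE_DTYPE` above lists field offsets but no explicit item size, so if `sizeof(Node)` is larger than the end of the last field, the two layouts might disagree. The `ctypes` mirror below is hypothetical; it assumes `SIZE_t` is a pointer-sized signed integer and `DOUBLE_t` is a C double, and `layout` is a helper name made up for illustration.

```python
import ctypes


class Node(ctypes.Structure):
    # Hypothetical mirror of the patched Node struct (not sklearn code).
    # Assumes SIZE_t -> ssize_t and DOUBLE_t -> double.
    _fields_ = [
        ("left_child", ctypes.c_ssize_t),
        ("right_child", ctypes.c_ssize_t),
        ("feature", ctypes.c_ssize_t),
        ("threshold", ctypes.c_double),
        ("impurity", ctypes.c_double),
        ("n_node_samples", ctypes.c_ssize_t),
        ("weighted_n_node_samples", ctypes.c_double),
        ("foo", ctypes.POINTER(ctypes.c_uint32)),
    ]


def layout(struct):
    """Return (name, offset, size) for every field of a ctypes Structure."""
    rows = []
    for name, _ctype in struct._fields_:
        field = getattr(struct, name)  # field descriptor with offset/size
        rows.append((name, field.offset, field.size))
    return rows


rows = layout(Node)
# Byte position just past the last field, ignoring any trailing padding.
last_end = max(offset + size for _, offset, size in rows)
# Bytes the platform ABI appends so that arrays of Node stay aligned.
trailing_padding = ctypes.sizeof(Node) - last_end

for name, offset, size in rows:
    print(f"{name:26s} offset={offset:3d} size={size}")
print("sizeof(Node)     =", ctypes.sizeof(Node))
print("end of last field =", last_end)
print("trailing padding  =", trailing_padding)
```

Run under the same 32-bit Windows Python that Appveyor uses, a nonzero trailing padding would suggest comparing `NODE_DTYPE.itemsize` against `sizeof(Node)` as a next step. On a platform where every field is 8 bytes wide, the padding should be zero.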