From szport at gmail.com  Fri Mar  1 07:31:14 2013
From: szport at gmail.com (ZS)
Date: Fri, 1 Mar 2013 09:31:14 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <512FC919.4010702@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
Message-ID: <CAPOE21TZopn873XsR1J8VCJ9gfyCpnEs9sFqXGQdBVLV7ZtJxQ@mail.gmail.com>

2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows to write unicode text parsing code almost at C speed
>>>> mostly in python (+ .pxd defintions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may nor make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
>
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.
>
>
>> Here are timings:
>>
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_1" "test_1()"
>> 50 loops, best of 5: 152 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_2" "test_2()"
>> 50 loops, best of 5: 86.5 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_3" "test_3()"
>> 50 loops, best of 5: 86.5 msec per loop
>>
>> So your suggestion would be preferable.
>
> Nice. Yes, looks like it' worth it.
>

Sure that same could be applied to unicode slicing too.

Zaur Shibzukhov

From szport at gmail.com  Fri Mar  1 07:43:34 2013
From: szport at gmail.com (ZS)
Date: Fri, 1 Mar 2013 09:43:34 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <512FC919.4010702@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
Message-ID: <CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>

2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows to write unicode text parsing code almost at C speed
>>>> mostly in python (+ .pxd defintions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may nor make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
>
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.
>
Intresting, could C compilers in optimization mode to eliminate unused
evaluation path in nested if statements with constant conditional
expressions?

From stefan_ml at behnel.de  Fri Mar  1 07:46:30 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 01 Mar 2013 07:46:30 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
Message-ID: <51304EC6.9050300@behnel.de>

ZS, 01.03.2013 07:43:
> 2013/3/1 Stefan Behnel:
>> ZS, 28.02.2013 21:07:
>>> 2013/2/28 Stefan Behnel:
>>>>> This allows to write unicode text parsing code almost at C speed
>>>>> mostly in python (+ .pxd defintions).
>>>>
>>>> I suggest simply adding a constant flag argument to the existing function
>>>> that states if checking should be done or not. Inlining will let the C
>>>> compiler drop the corresponding code, which may or may nor make it a little
>>>> faster.
>>>
>>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>>     Py_ssize_t length;
>>> #if CYTHON_PEP393_ENABLED
>>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>>> #endif
>>>     if (flag) {
>>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>>         if ((0 <= i) & (i < length)) {
>>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>>         } else if ((-length <= i) & (i < 0)) {
>>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>>         } else {
>>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>>             return (Py_UCS4)-1;
>>>         }
>>>     } else {
>>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>>     }
>>> }
>>
>> I think you could even pass in two flags, one for wraparound and one for
>> boundscheck, and then just evaluate them appropriately in the existing "if"
>> tests above. That should allow both features to be supported independently
>> in a fast way.
>>
> Intresting, could C compilers in optimization mode to eliminate unused
> evaluation path in nested if statements with constant conditional
> expressions?

They'd be worthless if they didn't do that. (Even Cython does it, BTW.)

Stefan


From szport at gmail.com  Fri Mar  1 07:54:56 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 1 Mar 2013 09:54:56 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <51304EC6.9050300@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
	<51304EC6.9050300@behnel.de>
Message-ID: <CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>

>>>
>>> I think you could even pass in two flags, one for wraparound and one for
>>> boundscheck, and then just evaluate them appropriately in the existing "if"
>>> tests above. That should allow both features to be supported independently
>>> in a fast way.
>>>
>> Intresting, could C compilers in optimization mode to eliminate unused
>> evaluation path in nested if statements with constant conditional
>> expressions?
>
> They'd be worthless if they didn't do that. (Even Cython does it, BTW.)
>
Then it can simplify writing utility code in order to support
different optimization flags in other cases too.

From robertwb at gmail.com  Fri Mar  1 08:25:09 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 28 Feb 2013 23:25:09 -0800
Subject: [Cython] Be more forgiving about memoryview strides
In-Reply-To: <CAPJVwBnzWzRjv+tYtrv7D4rpN-Zhoj9X+WXp5H7N28p1i3-vLg@mail.gmail.com>
References: <1362064397.2663.14.camel@sebastian-laptop>
	<CADiQ+QA0vF1YLtDih5iD7gn4MGhPnnnX845mUx3+2cwhyxK9Sg@mail.gmail.com>
	<CAPJVwBnzWzRjv+tYtrv7D4rpN-Zhoj9X+WXp5H7N28p1i3-vLg@mail.gmail.com>
Message-ID: <CADiQ+QD_iAR96AS0iA4fVJWprosywN_SRdAM7XZhH5c9Bgjpfg@mail.gmail.com>

On Thu, Feb 28, 2013 at 11:12 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Feb 28, 2013 at 5:50 PM, Robert Bradshaw <robertwb at gmail.com> wrote:
>> On Thu, Feb 28, 2013 at 7:13 AM, Sebastian Berg
>> <sebastian at sipsolutions.net> wrote:
>>> Hey,
>>>
>>> Maybe someone here already saw it (I don't have a track account, or I
>>> would just create a ticket), but it would be nice if Cython was more
>>> forgiving about contiguous requirements on strides. In the future this
>>> would make it easier for numpy to go forward with changing the
>>> contiguous flags to be more reasonable for its purpose, and second also
>>> to allow old (and maybe for the moment remaining) corner cases in numpy
>>> to slip past (as well as possibly the same for other programs...). An
>>> example is (see also https://github.com/numpy/numpy/issues/2956 and the
>>> PR linked there for more details):
>>>
>>> def add_one(array):
>>>     cdef double[::1] a = array
>>>     a[0] += 1.
>>>     return array
>>>
>>> giving:
>>>
>>>>>> add_one(np.ascontiguousarray(np.arange(10.)[::100]))
>>> ValueError: Buffer and memoryview are not contiguous in the same
>>> dimension.
>>>
>>> This could easily be changed if MemoryViews check the strides as "can be
>>> interpreted as contiguous". That means that if shape[i] == 1, then
>>> strides[i] are arbitrary (you can just change them if you like). This is
>>> also the case for 0-sized arrays, which are arguably always contiguous,
>>> no matter their strides are!
>>
>> I was under the impression that the primary value for contiguous is
>> that it a foo[::1] can be interpreted as a foo*. Letting strides be
>> arbitrary completely breaks this, right?
>
> Nope. The natural definition of "C contiguous" is "the array entries
> are arranged in memory in the same way they would be if they were a
> multidimensional C array" (i.e., what you said.) But it turns out that
> this is *not* the definition that numpy and cython use!
>
> The issue is that the above definition is a constraint on the actual
> locations of items in memory, i.e., given a shape, it tells you that
> for every index,
>  (a)  sum(index * strides) == sum(index * cumprod(shape[::-1])[::-1] * itemsize)
> Obviously this equality holds if
>  (b)  strides == cumprod(shape[::-1])[::-1] * itemsize
> (Or for F-contiguity, we have
>  (b')  strides == cumprod(shape) * itemsize
> )
>
> (a) is the natural definition of "C contiguous". (b) is the definition
> of "C contiguous" used by numpy and cython. (b) implies (a). But (a)
> does not imply (b), i.e., there are arrays that are C-contiguous which
> numpy and cython think are discontiguous. (Also in numpy there are
> some weird cases where numpy accidentally uses the correct definition,
> I think, which is the point of Sebastian's example.)
>
> In particular, if shape[i] == 1, then the value of stride[i] really
> should be irrelevant to judging contiguity, because the only thing you
> can do with strides[i] is multiply it by index[i], and if shape[i] ==
> 1 then index[i] is always 0. So an array of int8's with shape = (10,
> 1), strides = (1, 73) is contiguous according to (a), but not
> according to (b). Also if shape[i] is 0 for any i, then the entire
> contents of the strides array becomes irrelevant to judging
> contiguity; all zero-sized arrays are contiguous according to (a), but
> not (b).

Thanks for clarifying.

Yes, I think it makes a lot of sense to loosen our definition for
Cython. Internally, I think the only way we use this assumption is in
not requiring that the first/final index be multiplied by the stride,
which should be totally fine. But this merits closer inspection as
there may be something else.

> (This is really annoying for numpy because given, say, a column vector
> with shape (n, 1), it is impossible to be both C- and F-contiguous
> according to the (b)-style definition. But people expect expect
> various operations to preserve C versus F contiguity, so there are
> heuristics in numpy that try to guess whether various result arrays
> should pretend to be C- or F-contiguous, and we don't even have a
> consistent idea of what it would mean for this code to be working
> correctly, never mind test it and keep it working. OTOH if we just fix
> numpy to use the (a) definition, then it turns out a bunch of
> third-party code breaks, like, for example, cython.)

Can you give some examples?

- Robert

From szport at gmail.com  Fri Mar  1 08:37:00 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 1 Mar 2013 10:37:00 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21TZopn873XsR1J8VCJ9gfyCpnEs9sFqXGQdBVLV7ZtJxQ@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21TZopn873XsR1J8VCJ9gfyCpnEs9sFqXGQdBVLV7ZtJxQ@mail.gmail.com>
Message-ID: <CAPOE21QnJR2UURoBDcZVxqk4r2YxXxmJQwR48yGJKsf3_wfYPg@mail.gmail.com>

2013/3/1 ZS <szport at gmail.com>:
> 2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
>> ZS, 28.02.2013 21:07:
>>> 2013/2/28 Stefan Behnel:
>>>>> This allows to write unicode text parsing code almost at C speed
>>>>> mostly in python (+ .pxd defintions).
>>>>
>>>> I suggest simply adding a constant flag argument to the existing function
>>>> that states if checking should be done or not. Inlining will let the C
>>>> compiler drop the corresponding code, which may or may nor make it a little
>>>> faster.
>>>
>>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>>     Py_ssize_t length;
>>> #if CYTHON_PEP393_ENABLED
>>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>>> #endif
>>>     if (flag) {
>>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>>         if ((0 <= i) & (i < length)) {
>>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>>         } else if ((-length <= i) & (i < 0)) {
>>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>>         } else {
>>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>>             return (Py_UCS4)-1;
>>>         }
>>>     } else {
>>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>>     }
>>> }
>>
>> I think you could even pass in two flags, one for wraparound and one for
>> boundscheck, and then just evaluate them appropriately in the existing "if"
>> tests above. That should allow both features to be supported independently
>> in a fast way.
>>
>>
>>> Here are timings:
>>>
>>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>>> mytests.unicode_index import test_1" "test_1()"
>>> 50 loops, best of 5: 152 msec per loop
>>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>>> mytests.unicode_index import test_2" "test_2()"
>>> 50 loops, best of 5: 86.5 msec per loop
>>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>>> mytests.unicode_index import test_3" "test_3()"
>>> 50 loops, best of 5: 86.5 msec per loop
>>>
>>> So your suggestion would be preferable.
>>
>> Nice. Yes, looks like it' worth it.
>>
>
> Sure that same could be applied to unicode slicing too.
>
I had to verify myself first. So here is the test...

unicode_slice.h
---------------------

#include "unicodeobject.h"

static inline PyObject* unicode_slice(
            PyObject* text, Py_ssize_t start, Py_ssize_t stop);

/////////////// PyUnicode_Substring ///////////////

/* CURRENT */

static inline PyObject* unicode_slice(
            PyObject* text, Py_ssize_t start, Py_ssize_t stop) {
    Py_ssize_t length;
#if CYTHON_PEP393_ENABLED
    if (PyUnicode_READY(text) == -1) return NULL;
    length = PyUnicode_GET_LENGTH(text);
#else
    length = PyUnicode_GET_SIZE(text);
#endif
    if (start < 0) {
        start += length;
        if (start < 0)
            start = 0;
    }
    if (stop < 0)
        stop += length;
    else if (stop > length)
        stop = length;
    length = stop - start;
    if (length <= 0)
        return PyUnicode_FromUnicode(NULL, 0);
#if CYTHON_PEP393_ENABLED
    return PyUnicode_FromKindAndData(PyUnicode_KIND(text),
        PyUnicode_1BYTE_DATA(text) + start*PyUnicode_KIND(text), stop-start);
#else
    return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(text)+start, stop-start);
#endif
}

static inline PyObject* unicode_slice2(
            PyObject* text, Py_ssize_t start, Py_ssize_t stop, int flag);

/////////////// PyUnicode_Substring ///////////////

/* CHANGED */

static inline PyObject* unicode_slice2(
            PyObject* text, Py_ssize_t start, Py_ssize_t stop, int flag) {
    Py_ssize_t length;

#if CYTHON_PEP393_ENABLED
    if (PyUnicode_READY(text) == -1) return NULL;
#endif

if (flag) {
    #if CYTHON_PEP393_ENABLED
        length = PyUnicode_GET_LENGTH(text);
    #else
        length = PyUnicode_GET_SIZE(text);
    #endif
        if (start < 0) {
            start += length;
            if (start < 0)
                start = 0;
        }
        if (stop < 0)
            stop += length;
        else if (stop > length)
            stop = length;
        length = stop - start;
        if (length <= 0)
            return PyUnicode_FromUnicode(NULL, 0);
}

#if CYTHON_PEP393_ENABLED
    return PyUnicode_FromKindAndData(PyUnicode_KIND(text),
        PyUnicode_1BYTE_DATA(text) + start*PyUnicode_KIND(text), stop-start);
#else
    return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(text)+start, stop-start);
#endif
}

unicode_slice.pyx
------------------------

cdef extern from 'unicode_slice.h':
    inline unicode unicode_slice(unicode ustring, int start, int stop)
    inline unicode unicode_slice2(unicode ustring, int start, int
stop, int flag)

cdef unicode text = u"abcdefghigklmnopqrstuvwxyzabcdefghigklmnopqrstuvwxyz"

cdef long f_1(unicode text):
    cdef int i, j
    cdef int n = len(text)
    cdef int val
    cdef long S = 0

    for j in range(100000):
        for i in range(n):
            val = len(unicode_slice(text, 0, i))
            S += val * j

    return S

cdef long f_2(unicode text):
    cdef int i, j
    cdef int n = len(text)
    cdef int val
    cdef long S = 0

    for j in range(100000):
        for i in range(n):
            val = len(unicode_slice2(text, 0, i, 0))
            S += val * j

    return S


def test_1():
    f_1(text)

def test_2():
    f_2(text)

Here are timings:

(py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
mytests.unicode_slice import test_1" "test_1()"
50 loops, best of 5: 534 msec per loop
(py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
mytests.unicode_slice import test_2" "test_2()"
50 loops, best of 5: 523 msec per loop

Only 2%

Zaur Shibzukhov

From stefan_ml at behnel.de  Fri Mar  1 08:56:21 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 01 Mar 2013 08:56:21 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21QnJR2UURoBDcZVxqk4r2YxXxmJQwR48yGJKsf3_wfYPg@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21TZopn873XsR1J8VCJ9gfyCpnEs9sFqXGQdBVLV7ZtJxQ@mail.gmail.com>
	<CAPOE21QnJR2UURoBDcZVxqk4r2YxXxmJQwR48yGJKsf3_wfYPg@mail.gmail.com>
Message-ID: <51305F25.5040805@behnel.de>

Zaur Shibzukhov, 01.03.2013 08:37:
> unicode_slice.h
> ---------------------
> 
> #include "unicodeobject.h"
> 
> static inline PyObject* unicode_slice(
>             PyObject* text, Py_ssize_t start, Py_ssize_t stop);
> 
> /////////////// PyUnicode_Substring ///////////////
> 
> /* CURRENT */
> 
> static inline PyObject* unicode_slice(
>             PyObject* text, Py_ssize_t start, Py_ssize_t stop) {
>     Py_ssize_t length;
> #if CYTHON_PEP393_ENABLED
>     if (PyUnicode_READY(text) == -1) return NULL;
>     length = PyUnicode_GET_LENGTH(text);
> #else
>     length = PyUnicode_GET_SIZE(text);
> #endif
>     if (start < 0) {
>         start += length;
>         if (start < 0)
>             start = 0;
>     }
>     if (stop < 0)
>         stop += length;
>     else if (stop > length)
>         stop = length;
>     length = stop - start;
>     if (length <= 0)
>         return PyUnicode_FromUnicode(NULL, 0);
> #if CYTHON_PEP393_ENABLED
>     return PyUnicode_FromKindAndData(PyUnicode_KIND(text),
>         PyUnicode_1BYTE_DATA(text) + start*PyUnicode_KIND(text), stop-start);
> #else
>     return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(text)+start, stop-start);
> #endif
> }
> 
> static inline PyObject* unicode_slice2(
>             PyObject* text, Py_ssize_t start, Py_ssize_t stop, int flag);
> 
> /////////////// PyUnicode_Substring ///////////////
> 
> /* CHANGED */
> 
> static inline PyObject* unicode_slice2(
>             PyObject* text, Py_ssize_t start, Py_ssize_t stop, int flag) {
>     Py_ssize_t length;
> 
> #if CYTHON_PEP393_ENABLED
>     if (PyUnicode_READY(text) == -1) return NULL;
> #endif
> 
> if (flag) {
>     #if CYTHON_PEP393_ENABLED
>         length = PyUnicode_GET_LENGTH(text);
>     #else
>         length = PyUnicode_GET_SIZE(text);
>     #endif
>         if (start < 0) {
>             start += length;
>             if (start < 0)
>                 start = 0;
>         }
>         if (stop < 0)
>             stop += length;
>         else if (stop > length)
>             stop = length;
>         length = stop - start;
>         if (length <= 0)
>             return PyUnicode_FromUnicode(NULL, 0);
> }
> 
> #if CYTHON_PEP393_ENABLED
>     return PyUnicode_FromKindAndData(PyUnicode_KIND(text),
>         PyUnicode_1BYTE_DATA(text) + start*PyUnicode_KIND(text), stop-start);
> #else
>     return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(text)+start, stop-start);
> #endif
> }
> 
> unicode_slice.pyx
> ------------------------
> 
> cdef extern from 'unicode_slice.h':
>     inline unicode unicode_slice(unicode ustring, int start, int stop)
>     inline unicode unicode_slice2(unicode ustring, int start, int
> stop, int flag)
> 
> cdef unicode text = u"abcdefghigklmnopqrstuvwxyzabcdefghigklmnopqrstuvwxyz"
> 
> cdef long f_1(unicode text):
>     cdef int i, j
>     cdef int n = len(text)
>     cdef int val
>     cdef long S = 0
> 
>     for j in range(100000):
>         for i in range(n):
>             val = len(unicode_slice(text, 0, i))
>             S += val * j
> 
>     return S
> 
> cdef long f_2(unicode text):
>     cdef int i, j
>     cdef int n = len(text)
>     cdef int val
>     cdef long S = 0
> 
>     for j in range(100000):
>         for i in range(n):
>             val = len(unicode_slice2(text, 0, i, 0))
>             S += val * j
> 
>     return S
> 
> 
> def test_1():
>     f_1(text)
> 
> def test_2():
>     f_2(text)
> 
> Here are timings:
> 
> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
> mytests.unicode_slice import test_1" "test_1()"
> 50 loops, best of 5: 534 msec per loop
> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
> mytests.unicode_slice import test_2" "test_2()"
> 50 loops, best of 5: 523 msec per loop
> 
> Only 2%

That's to be expected. Creating a Unicode string object is the highly
dominating operation here, including memory allocation, object type
selection and what not.

Stefan


From stefan_ml at behnel.de  Fri Mar  1 09:00:02 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 01 Mar 2013 09:00:02 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
	<51304EC6.9050300@behnel.de>
	<CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
Message-ID: <51306002.8000701@behnel.de>

Zaur Shibzukhov, 01.03.2013 07:54:
>>>> I think you could even pass in two flags, one for wraparound and one for
>>>> boundscheck, and then just evaluate them appropriately in the existing "if"
>>>> tests above. That should allow both features to be supported independently
>>>> in a fast way.
>>>>
>>> Intresting, could C compilers in optimization mode to eliminate unused
>>> evaluation path in nested if statements with constant conditional
>>> expressions?
>>
>> They'd be worthless if they didn't do that. (Even Cython does it, BTW.)
>>
> Then it can simplify writing utility code in order to support
> different optimization flags in other cases too.

Usually, yes. Look at the dict iteration code, for example, which makes
pretty heavy use of it.

This may not work in all cases, because the C compiler can decide to *not*
inline a function, for example, or may not be capable of cutting down the
code sufficiently in some specific cases.

I agree in general, but I wouldn't say that it's worth changing existing
(and working) code.

Stefan


From szport at gmail.com  Fri Mar  1 09:07:42 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 1 Mar 2013 11:07:42 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <51306002.8000701@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
	<51304EC6.9050300@behnel.de>
	<CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
	<51306002.8000701@behnel.de>
Message-ID: <CAPOE21QEG0hqG6oT2hs_52hrmeB39KaUcqijFNaiHzkS=2WvDw@mail.gmail.com>

>> Then it can simplify writing utility code in order to support
>> different optimization flags in other cases too.
>
> Usually, yes. Look at the dict iteration code, for example, which makes
> pretty heavy use of it.
>
> This may not work in all cases, because the C compiler can decide to *not*
> inline a function, for example, or may not be capable of cutting down the
> code sufficiently in some specific cases.
>
> I agree in general, but I wouldn't say that it's worth changing existing
> (and working) code.
>
Thus the strategy of specialization of code to handle special cases,
while maintaining the existing code, which works well in general, is
the preferred?

From robertwb at gmail.com  Fri Mar  1 09:25:15 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Fri, 1 Mar 2013 00:25:15 -0800
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
	<51304EC6.9050300@behnel.de>
	<CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
Message-ID: <CADiQ+QDgreHd90-rzMCiH8YBi4kB2VE_D-WAO16Xj0NPynVnQQ@mail.gmail.com>

On Thu, Feb 28, 2013 at 10:54 PM, Zaur Shibzukhov <szport at gmail.com> wrote:
>>>>
>>>> I think you could even pass in two flags, one for wraparound and one for
>>>> boundscheck, and then just evaluate them appropriately in the existing "if"
>>>> tests above. That should allow both features to be supported independently
>>>> in a fast way.
>>>>
>>> Intresting, could C compilers in optimization mode to eliminate unused
>>> evaluation path in nested if statements with constant conditional
>>> expressions?
>>
>> They'd be worthless if they didn't do that. (Even Cython does it, BTW.)
>>
> Then it can simplify writing utility code in order to support
> different optimization flags in other cases too.

The one thing you don't have much control over is whether the C
compiler will actually inline the function (CYTHON_INLINE is just a
hint). In particular, it may decide the function is too large to
inline before realizing how small it would become given the constant
arguments. I'm actually not sure how much of a problem this is in
practice...

- Robert

From szport at gmail.com  Fri Mar  1 10:46:39 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 1 Mar 2013 12:46:39 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <512FC919.4010702@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
Message-ID: <CAPOE21RXWxf0_ujTbS_pftU0Dk0x1UFwU+_NsGDpxFvo4JZJYQ@mail.gmail.com>

2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows to write unicode text parsing code almost at C speed
>>>> mostly in python (+ .pxd defintions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may nor make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
>
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.
>
>
>> Here are timings:
>>
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_1" "test_1()"
>> 50 loops, best of 5: 152 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_2" "test_2()"
>> 50 loops, best of 5: 86.5 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from
>> mytests.unicode_index import test_3" "test_3()"
>> 50 loops, best of 5: 86.5 msec per loop
>>
>> So your suggestion would be preferable.
>
> Nice. Yes, looks like it' worth it.
>
Could I help in order to include this in 19.0?

Zaur Shibzukhov

From stefan_ml at behnel.de  Fri Mar  1 11:47:59 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 01 Mar 2013 11:47:59 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21RXWxf0_ujTbS_pftU0Dk0x1UFwU+_NsGDpxFvo4JZJYQ@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21RXWxf0_ujTbS_pftU0Dk0x1UFwU+_NsGDpxFvo4JZJYQ@mail.gmail.com>
Message-ID: <5130875F.7090406@behnel.de>

Zaur Shibzukhov, 01.03.2013 10:46:
> Could I help in order to include this in 19.0?

I like pull requests. ;)

Stefan


From szport at gmail.com  Fri Mar  1 11:54:47 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 1 Mar 2013 13:54:47 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <5130875F.7090406@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21RXWxf0_ujTbS_pftU0Dk0x1UFwU+_NsGDpxFvo4JZJYQ@mail.gmail.com>
	<5130875F.7090406@behnel.de>
Message-ID: <CAPOE21S42oKucCMKdUFeJw7P_9mAJzW_99yDCaPMnLOgJ-O7-w@mail.gmail.com>

2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> Zaur Shibzukhov, 01.03.2013 10:46:
>> Could I help in order to include this in 19.0?
>
> I like pull requests. ;)
>
OK

From sebastian at sipsolutions.net  Fri Mar  1 16:56:39 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 01 Mar 2013 16:56:39 +0100
Subject: [Cython] Be more forgiving about memoryview strides
In-Reply-To: <CADiQ+QD_iAR96AS0iA4fVJWprosywN_SRdAM7XZhH5c9Bgjpfg@mail.gmail.com>
References: <1362064397.2663.14.camel@sebastian-laptop>
	<CADiQ+QA0vF1YLtDih5iD7gn4MGhPnnnX845mUx3+2cwhyxK9Sg@mail.gmail.com>
	<CAPJVwBnzWzRjv+tYtrv7D4rpN-Zhoj9X+WXp5H7N28p1i3-vLg@mail.gmail.com>
	<CADiQ+QD_iAR96AS0iA4fVJWprosywN_SRdAM7XZhH5c9Bgjpfg@mail.gmail.com>
Message-ID: <1362153399.13987.74.camel@sebastian-laptop>

On Thu, 2013-02-28 at 23:25 -0800, Robert Bradshaw wrote:
> On Thu, Feb 28, 2013 at 11:12 AM, Nathaniel Smith <njs at pobox.com> wrote:
> > On Thu, Feb 28, 2013 at 5:50 PM, Robert Bradshaw <robertwb at gmail.com> wrote:
> >> On Thu, Feb 28, 2013 at 7:13 AM, Sebastian Berg
> >> <sebastian at sipsolutions.net> wrote:
> >>> Hey,
> >>>
> >>> Maybe someone here already saw it (I don't have a track account, or I
> >>> would just create a ticket), but it would be nice if Cython was more
> >>> forgiving about contiguous requirements on strides. In the future this
> >>> would make it easier for numpy to go forward with changing the
> >>> contiguous flags to be more reasonable for its purpose, and second also
> >>> to allow old (and maybe for the moment remaining) corner cases in numpy
> >>> to slip past (as well as possibly the same for other programs...). An
> >>> example is (see also https://github.com/numpy/numpy/issues/2956 and the
> >>> PR linked there for more details):
> >>>
> >>> def add_one(array):
> >>>     cdef double[::1] a = array
> >>>     a[0] += 1.
> >>>     return array
> >>>
> >>> giving:
> >>>
> >>>>>> add_one(np.ascontiguousarray(np.arange(10.)[::100]))
> >>> ValueError: Buffer and memoryview are not contiguous in the same
> >>> dimension.
> >>>
> >>> This could easily be changed if MemoryViews check the strides as "can be
> >>> interpreted as contiguous". That means that if shape[i] == 1, then
> >>> strides[i] are arbitrary (you can just change them if you like). This is
> >>> also the case for 0-sized arrays, which are arguably always contiguous,
> >>> no matter their strides are!
> >>
> >> I was under the impression that the primary value for contiguous is
> >> that it a foo[::1] can be interpreted as a foo*. Letting strides be
> >> arbitrary completely breaks this, right?
> >
> > Nope. The natural definition of "C contiguous" is "the array entries
> > are arranged in memory in the same way they would be if they were a
> > multidimensional C array" (i.e., what you said.) But it turns out that
> > this is *not* the definition that numpy and cython use!
> >
> > The issue is that the above definition is a constraint on the actual
> > locations of items in memory, i.e., given a shape, it tells you that
> > for every index,
> >  (a)  sum(index * strides) == sum(index * cumprod(shape[::-1])[::-1] * itemsize)
> > Obviously this equality holds if
> >  (b)  strides == cumprod(shape[::-1])[::-1] * itemsize
> > (Or for F-contiguity, we have
> >  (b')  strides == cumprod(shape) * itemsize
> > )
> >
> > (a) is the natural definition of "C contiguous". (b) is the definition
> > of "C contiguous" used by numpy and cython. (b) implies (a). But (a)
> > does not imply (b), i.e., there are arrays that are C-contiguous which
> > numpy and cython think are discontiguous. (Also in numpy there are
> > some weird cases where numpy accidentally uses the correct definition,
> > I think, which is the point of Sebastian's example.)
> >
> > In particular, if shape[i] == 1, then the value of stride[i] really
> > should be irrelevant to judging contiguity, because the only thing you
> > can do with strides[i] is multiply it by index[i], and if shape[i] ==
> > 1 then index[i] is always 0. So an array of int8's with shape = (10,
> > 1), strides = (1, 73) is contiguous according to (a), but not
> > according to (b). Also if shape[i] is 0 for any i, then the entire
> > contents of the strides array becomes irrelevant to judging
> > contiguity; all zero-sized arrays are contiguous according to (a), but
> > not (b).
> 
> Thanks for clarifying.
> 
> Yes, I think it makes a lot of sense to loosen our definition for
> Cython. Internally, I think the only way we use this assumption is in
> not requiring that the first/final index be multiplied by the stride,
> which should be totally fine. But this merits closer inspection as
> there may be something else.

The only problem I saw was code that used strides[-1] instead of the
itemsize (e.g. using strides[i]/strides[-1] to then index the typed
buffer instead of using strides[i]/itemsize). But that should be easy to
check, numpy had two or so cases of that itself...

> 
> > (This is really annoying for numpy because given, say, a column vector
> > with shape (n, 1), it is impossible to be both C- and F-contiguous
> > according to the (b)-style definition. But people expect expect
> > various operations to preserve C versus F contiguity, so there are
> > heuristics in numpy that try to guess whether various result arrays
> > should pretend to be C- or F-contiguous, and we don't even have a
> > consistent idea of what it would mean for this code to be working
> > correctly, never mind test it and keep it working. OTOH if we just fix
> > numpy to use the (a) definition, then it turns out a bunch of
> > third-party code breaks, like, for example, cython.)
> 
> Can you give some examples?
> 

Not sure for what :). Maybe this is an example:

In [1]: a = np.asmatrix(np.arange(9).reshape(3,3).T)

In [2]: a.flags.f_contiguous
Out[2]: True

In [3]: a[:,0].flags
Out[3]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  ...

Where that view could just as well be F-contiguous, and the fact that
numpy, when in doubt, prefers C-contiguous might be surprising. And
since it would be less strict to begin with, numpy may safe a copy here
or there (without adding weird stride fixing code).

Examples for code breakage would be this check as well as scikit-learn
and scipy in 3 or 4 cases making the assumption above of itemsize ==
strides[-1] for c-contiguous arrays.

> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
> 


From robertwb at gmail.com  Fri Mar  1 21:17:27 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Fri, 1 Mar 2013 12:17:27 -0800
Subject: [Cython] Be more forgiving about memoryview strides
In-Reply-To: <1362153399.13987.74.camel@sebastian-laptop>
References: <1362064397.2663.14.camel@sebastian-laptop>
	<CADiQ+QA0vF1YLtDih5iD7gn4MGhPnnnX845mUx3+2cwhyxK9Sg@mail.gmail.com>
	<CAPJVwBnzWzRjv+tYtrv7D4rpN-Zhoj9X+WXp5H7N28p1i3-vLg@mail.gmail.com>
	<CADiQ+QD_iAR96AS0iA4fVJWprosywN_SRdAM7XZhH5c9Bgjpfg@mail.gmail.com>
	<1362153399.13987.74.camel@sebastian-laptop>
Message-ID: <CADiQ+QB8fZcPc1jevrKZF49xs02rSZtfnrsPN=PFmRT0YYiHzQ@mail.gmail.com>

On Fri, Mar 1, 2013 at 7:56 AM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> On Thu, 2013-02-28 at 23:25 -0800, Robert Bradshaw wrote:
>> On Thu, Feb 28, 2013 at 11:12 AM, Nathaniel Smith <njs at pobox.com> wrote:
>> > On Thu, Feb 28, 2013 at 5:50 PM, Robert Bradshaw <robertwb at gmail.com> wrote:
>> >> On Thu, Feb 28, 2013 at 7:13 AM, Sebastian Berg
>> >> <sebastian at sipsolutions.net> wrote:
>> >>> Hey,
>> >>>
>> >>> Maybe someone here already saw it (I don't have a track account, or I
>> >>> would just create a ticket), but it would be nice if Cython was more
>> >>> forgiving about contiguous requirements on strides. In the future this
>> >>> would make it easier for numpy to go forward with changing the
>> >>> contiguous flags to be more reasonable for its purpose, and second also
>> >>> to allow old (and maybe for the moment remaining) corner cases in numpy
>> >>> to slip past (as well as possibly the same for other programs...). An
>> >>> example is (see also https://github.com/numpy/numpy/issues/2956 and the
>> >>> PR linked there for more details):
>> >>>
>> >>> def add_one(array):
>> >>>     cdef double[::1] a = array
>> >>>     a[0] += 1.
>> >>>     return array
>> >>>
>> >>> giving:
>> >>>
>> >>>>>> add_one(np.ascontiguousarray(np.arange(10.)[::100]))
>> >>> ValueError: Buffer and memoryview are not contiguous in the same
>> >>> dimension.
>> >>>
>> >>> This could easily be changed if MemoryViews check the strides as "can be
>> >>> interpreted as contiguous". That means that if shape[i] == 1, then
>> >>> strides[i] are arbitrary (you can just change them if you like). This is
>> >>> also the case for 0-sized arrays, which are arguably always contiguous,
>> >>> no matter their strides are!
>> >>
>> >> I was under the impression that the primary value for contiguous is
>> >> that it a foo[::1] can be interpreted as a foo*. Letting strides be
>> >> arbitrary completely breaks this, right?
>> >
>> > Nope. The natural definition of "C contiguous" is "the array entries
>> > are arranged in memory in the same way they would be if they were a
>> > multidimensional C array" (i.e., what you said.) But it turns out that
>> > this is *not* the definition that numpy and cython use!
>> >
>> > The issue is that the above definition is a constraint on the actual
>> > locations of items in memory, i.e., given a shape, it tells you that
>> > for every index,
>> >  (a)  sum(index * strides) == sum(index * cumprod(shape[::-1])[::-1] * itemsize)
>> > Obviously this equality holds if
>> >  (b)  strides == cumprod(shape[::-1])[::-1] * itemsize
>> > (Or for F-contiguity, we have
>> >  (b')  strides == cumprod(shape) * itemsize
>> > )
>> >
>> > (a) is the natural definition of "C contiguous". (b) is the definition
>> > of "C contiguous" used by numpy and cython. (b) implies (a). But (a)
>> > does not imply (b), i.e., there are arrays that are C-contiguous which
>> > numpy and cython think are discontiguous. (Also in numpy there are
>> > some weird cases where numpy accidentally uses the correct definition,
>> > I think, which is the point of Sebastian's example.)
>> >
>> > In particular, if shape[i] == 1, then the value of stride[i] really
>> > should be irrelevant to judging contiguity, because the only thing you
>> > can do with strides[i] is multiply it by index[i], and if shape[i] ==
>> > 1 then index[i] is always 0. So an array of int8's with shape = (10,
>> > 1), strides = (1, 73) is contiguous according to (a), but not
>> > according to (b). Also if shape[i] is 0 for any i, then the entire
>> > contents of the strides array becomes irrelevant to judging
>> > contiguity; all zero-sized arrays are contiguous according to (a), but
>> > not (b).
>>
>> Thanks for clarifying.
>>
>> Yes, I think it makes a lot of sense to loosen our definition for
>> Cython. Internally, I think the only way we use this assumption is in
>> not requiring that the first/final index be multiplied by the stride,
>> which should be totally fine. But this merits closer inspection as
>> there may be something else.
>
> The only problem I saw was code that used strides[-1] instead of the
> itemsize (e.g. using strides[i]/strides[-1] to then index the typed
> buffer instead of using strides[i]/itemsize). But that should be easy to
> check, numpy had two or so cases of that itself...

I'd be surprised if we do that, but the only way to be sure would be
to look at the code.

>> > (This is really annoying for numpy because given, say, a column vector
>> > with shape (n, 1), it is impossible to be both C- and F-contiguous
>> > according to the (b)-style definition. But people expect expect
>> > various operations to preserve C versus F contiguity, so there are
>> > heuristics in numpy that try to guess whether various result arrays
>> > should pretend to be C- or F-contiguous, and we don't even have a
>> > consistent idea of what it would mean for this code to be working
>> > correctly, never mind test it and keep it working. OTOH if we just fix
>> > numpy to use the (a) definition, then it turns out a bunch of
>> > third-party code breaks, like, for example, cython.)
>>
>> Can you give some examples?
>>
>
> Not sure for what :).

I meant examples of possible breakage.

> Maybe this is an example:
>
> In [1]: a = np.asmatrix(np.arange(9).reshape(3,3).T)
>
> In [2]: a.flags.f_contiguous
> Out[2]: True
>
> In [3]: a[:,0].flags
> Out[3]:
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : False
>   ...
>
> Where that view could just as well be F-contiguous, and the fact that
> numpy, when in doubt, prefers C-contiguous might be surprising. And
> since it would be less strict to begin with, numpy may safe a copy here
> or there (without adding weird stride fixing code).
>
> Examples for code breakage would be this check as well as scikit-learn
> and scipy in 3 or 4 cases making the assumption above of itemsize ==
> strides[-1] for c-contiguous arrays.

Ah.

So, assuming Cython itself isn't making such assumptions, what support
do you want from Cython? I can see (1) accepting as c/f contiguous
arrays that meet this looser definition and (2) setting these flags in
memoryviews we produce under this looser definition. Is there anything
else?

- Robert

From sebastian at sipsolutions.net  Fri Mar  1 22:14:53 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 01 Mar 2013 22:14:53 +0100
Subject: [Cython] Be more forgiving about memoryview strides
In-Reply-To: <CADiQ+QB8fZcPc1jevrKZF49xs02rSZtfnrsPN=PFmRT0YYiHzQ@mail.gmail.com>
References: <1362064397.2663.14.camel@sebastian-laptop>
	<CADiQ+QA0vF1YLtDih5iD7gn4MGhPnnnX845mUx3+2cwhyxK9Sg@mail.gmail.com>
	<CAPJVwBnzWzRjv+tYtrv7D4rpN-Zhoj9X+WXp5H7N28p1i3-vLg@mail.gmail.com>
	<CADiQ+QD_iAR96AS0iA4fVJWprosywN_SRdAM7XZhH5c9Bgjpfg@mail.gmail.com>
	<1362153399.13987.74.camel@sebastian-laptop>
	<CADiQ+QB8fZcPc1jevrKZF49xs02rSZtfnrsPN=PFmRT0YYiHzQ@mail.gmail.com>
Message-ID: <1362172493.13987.137.camel@sebastian-laptop>

On Fri, 2013-03-01 at 12:17 -0800, Robert Bradshaw wrote:
> On Fri, Mar 1, 2013 at 7:56 AM, Sebastian Berg
> <sebastian at sipsolutions.net> wrote:
> > On Thu, 2013-02-28 at 23:25 -0800, Robert Bradshaw wrote:
> >> On Thu, Feb 28, 2013 at 11:12 AM, Nathaniel Smith <njs at pobox.com> wrote:
> >> > On Thu, Feb 28, 2013 at 5:50 PM, Robert Bradshaw <robertwb at gmail.com> wrote:
> >> >> On Thu, Feb 28, 2013 at 7:13 AM, Sebastian Berg
> >> >> <sebastian at sipsolutions.net> wrote:
> >> >>> Hey,
> >> >>>
> >> >>> Maybe someone here already saw it (I don't have a track account, or I
> >> >>> would just create a ticket), but it would be nice if Cython was more
> >> >>> forgiving about contiguous requirements on strides. In the future this
> >> >>> would make it easier for numpy to go forward with changing the
> >> >>> contiguous flags to be more reasonable for its purpose, and second also
> >> >>> to allow old (and maybe for the moment remaining) corner cases in numpy
> >> >>> to slip past (as well as possibly the same for other programs...). An
> >> >>> example is (see also https://github.com/numpy/numpy/issues/2956 and the
> >> >>> PR linked there for more details):
> >> >>>
> >> >>> def add_one(array):
> >> >>>     cdef double[::1] a = array
> >> >>>     a[0] += 1.
> >> >>>     return array
> >> >>>
> >> >>> giving:
> >> >>>
> >> >>>>>> add_one(np.ascontiguousarray(np.arange(10.)[::100]))
> >> >>> ValueError: Buffer and memoryview are not contiguous in the same
> >> >>> dimension.
> >> >>>
> >> >>> This could easily be changed if MemoryViews check the strides as "can be
> >> >>> interpreted as contiguous". That means that if shape[i] == 1, then
> >> >>> strides[i] are arbitrary (you can just change them if you like). This is
> >> >>> also the case for 0-sized arrays, which are arguably always contiguous,
> >> >>> no matter their strides are!
> >> >>
> >> >> I was under the impression that the primary value for contiguous is
> >> >> that it a foo[::1] can be interpreted as a foo*. Letting strides be
> >> >> arbitrary completely breaks this, right?
> >> >
> >> > Nope. The natural definition of "C contiguous" is "the array entries
> >> > are arranged in memory in the same way they would be if they were a
> >> > multidimensional C array" (i.e., what you said.) But it turns out that
> >> > this is *not* the definition that numpy and cython use!
> >> >
> >> > The issue is that the above definition is a constraint on the actual
> >> > locations of items in memory, i.e., given a shape, it tells you that
> >> > for every index,
> >> >  (a)  sum(index * strides) == sum(index * cumprod(shape[::-1])[::-1] * itemsize)
> >> > Obviously this equality holds if
> >> >  (b)  strides == cumprod(shape[::-1])[::-1] * itemsize
> >> > (Or for F-contiguity, we have
> >> >  (b')  strides == cumprod(shape) * itemsize
> >> > )
> >> >
> >> > (a) is the natural definition of "C contiguous". (b) is the definition
> >> > of "C contiguous" used by numpy and cython. (b) implies (a). But (a)
> >> > does not imply (b), i.e., there are arrays that are C-contiguous which
> >> > numpy and cython think are discontiguous. (Also in numpy there are
> >> > some weird cases where numpy accidentally uses the correct definition,
> >> > I think, which is the point of Sebastian's example.)
> >> >
> >> > In particular, if shape[i] == 1, then the value of stride[i] really
> >> > should be irrelevant to judging contiguity, because the only thing you
> >> > can do with strides[i] is multiply it by index[i], and if shape[i] ==
> >> > 1 then index[i] is always 0. So an array of int8's with shape = (10,
> >> > 1), strides = (1, 73) is contiguous according to (a), but not
> >> > according to (b). Also if shape[i] is 0 for any i, then the entire
> >> > contents of the strides array becomes irrelevant to judging
> >> > contiguity; all zero-sized arrays are contiguous according to (a), but
> >> > not (b).
> >>
> >> Thanks for clarifying.
> >>
> >> Yes, I think it makes a lot of sense to loosen our definition for
> >> Cython. Internally, I think the only way we use this assumption is in
> >> not requiring that the first/final index be multiplied by the stride,
> >> which should be totally fine. But this merits closer inspection as
> >> there may be something else.
> >
> > The only problem I saw was code that used strides[-1] instead of the
> > itemsize (e.g. using strides[i]/strides[-1] to then index the typed
> > buffer instead of using strides[i]/itemsize). But that should be easy to
> > check, numpy had two or so cases of that itself...
> 
> I'd be surprised if we do that, but the only way to be sure would be
> to look at the code.
> 
> >> > (This is really annoying for numpy because given, say, a column vector
> >> > with shape (n, 1), it is impossible to be both C- and F-contiguous
> >> > according to the (b)-style definition. But people expect expect
> >> > various operations to preserve C versus F contiguity, so there are
> >> > heuristics in numpy that try to guess whether various result arrays
> >> > should pretend to be C- or F-contiguous, and we don't even have a
> >> > consistent idea of what it would mean for this code to be working
> >> > correctly, never mind test it and keep it working. OTOH if we just fix
> >> > numpy to use the (a) definition, then it turns out a bunch of
> >> > third-party code breaks, like, for example, cython.)
> >>
> >> Can you give some examples?
> >>
> >
> > Not sure for what :).
> 
> I meant examples of possible breakage.
> 
> > Maybe this is an example:
> >
> > In [1]: a = np.asmatrix(np.arange(9).reshape(3,3).T)
> >
> > In [2]: a.flags.f_contiguous
> > Out[2]: True
> >
> > In [3]: a[:,0].flags
> > Out[3]:
> >   C_CONTIGUOUS : True
> >   F_CONTIGUOUS : False
> >   ...
> >
> > Where that view could just as well be F-contiguous, and the fact that
> > numpy, when in doubt, prefers C-contiguous might be surprising. And
> > since it would be less strict to begin with, numpy may safe a copy here
> > or there (without adding weird stride fixing code).
> >
> > Examples for code breakage would be this check as well as scikit-learn
> > and scipy in 3 or 4 cases making the assumption above of itemsize ==
> > strides[-1] for c-contiguous arrays.
> 
> Ah.
> 
> So, assuming Cython itself isn't making such assumptions, what support
> do you want from Cython? I can see (1) accepting as c/f contiguous
> arrays that meet this looser definition and (2) setting these flags in
> memoryviews we produce under this looser definition. Is there anything
> else?
> 

Just accepting it would be cool. I am not aware that (2) would matter
for numpy, so just do whatever you feel best. I doubt numpy will change
them in a release version any time soon, but will be nice to know that
it can without breaking cython based code!

I am wondering if there is a way to work around/warn users doing this
(this is what sk-learn had):

    cdef np.ndarray[ndim=2, mode='c'] a = array
    step = a.strides[0]/a.strides[1]
    # Then using a.data[step]

but I am not sure. I first thought that if it is easy, you could point
a.strides to the buffers strides, allowing numpy to fix those. But just
realized that it would be weird since ndarray.strides is an attribute
that can be set.
And since as I understand this is discouraged already, it is probably
not worth it to think about it much.

- Sebastian

> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
> 


From nikita at nemkin.ru  Sat Mar  2 07:52:44 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sat, 02 Mar 2013 12:52:44 +0600
Subject: [Cython] Two minor bugs
Message-ID: <op.wta316f7yabthb@juga>

Hi,

I'm new to this list and to Cython internals.

Reporting two recently found bugs:

1. Explicit <bytes> cast fails unexpectedly:

        ctypedef char* LPSTR
        cdef LPSTR c_str = b"ascii"
        <bytes>c_str  # Failure: Python objects cannot be cast from  
pointers of primitive types

    The problem is CTypedefType not delegating can_coerce_to_pyobject() to  
the original type.
    (because BaseType.can_coerce_to_pyobject takes precedence over  
__getattr__).
    Patch+test case and attached.

    Interestingly, implicit casts use a different code path and are not  
affected.

    There is potential for similar bugs in the future, because __getattr__
    delegation is inherently brittle in the presence of the base class  
(BaseType).

2. This recently added code does not compile with MSVC:
    https://github.com/cython/cython/blob/master/Cython/Utility/TypeConversion.c#L140-142
    Interleaving declarations and statements is not allowed in C90...


Best Regards,
Nikita Nemkin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fixed-explicit-coercion-of-ctypedef-ed-C-types.patch
Type: application/octet-stream
Size: 2293 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130302/2226d1fb/attachment.obj>

From stefan_ml at behnel.de  Sat Mar  2 11:52:34 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 02 Mar 2013 11:52:34 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CADiQ+QDgreHd90-rzMCiH8YBi4kB2VE_D-WAO16Xj0NPynVnQQ@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
	<CAPOE21ROETxOeDd2uk3JDVC=s8xiT9EoMhkZ4DyiU7vWNXW8Nw@mail.gmail.com>
	<51304EC6.9050300@behnel.de>
	<CAPOE21Q10UHuMY0sQfsYc9kAHXAvsCp7ZNH-zNwAreu6=EgpLA@mail.gmail.com>
	<CADiQ+QDgreHd90-rzMCiH8YBi4kB2VE_D-WAO16Xj0NPynVnQQ@mail.gmail.com>
Message-ID: <5131D9F2.5070608@behnel.de>

Robert Bradshaw, 01.03.2013 09:25:
> On Thu, Feb 28, 2013 at 10:54 PM, Zaur Shibzukhov wrote:
>>>>> I think you could even pass in two flags, one for wraparound and one for
>>>>> boundscheck, and then just evaluate them appropriately in the existing "if"
>>>>> tests above. That should allow both features to be supported independently
>>>>> in a fast way.
>>>>>
>>>> Intresting, could C compilers in optimization mode to eliminate unused
>>>> evaluation path in nested if statements with constant conditional
>>>> expressions?
>>>
>>> They'd be worthless if they didn't do that. (Even Cython does it, BTW.)
>>>
>> Then it can simplify writing utility code in order to support
>> different optimization flags in other cases too.
> 
> The one thing you don't have much control over is whether the C
> compiler will actually inline the function (CYTHON_INLINE is just a
> hint). In particular, it may decide the function is too large to
> inline before realizing how small it would become given the constant
> arguments. I'm actually not sure how much of a problem this is in
> practice...

I tried it out for the Get/Set/DelItemInt() utility functions and took a
look at the generated assembly (gcc -O3). It does look as expected and
sometimes also better than what we currently generate. So I think it's
worth it.

https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94

I'd be happy if someone else could give this change a review to make sure I
got all conditions right.

Stefan


From stefan_ml at behnel.de  Sat Mar  2 11:56:13 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 02 Mar 2013 11:56:13 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <512FC919.4010702@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de>
Message-ID: <5131DACD.6050402@behnel.de>

Stefan Behnel, 28.02.2013 22:16:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows to write unicode text parsing code almost at C speed
>>>> mostly in python (+ .pxd defintions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may nor make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
> 
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.

Done.

https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94

Stefan


From stefan_ml at behnel.de  Sat Mar  2 12:15:50 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 02 Mar 2013 12:15:50 +0100
Subject: [Cython] To Add datetime.pxd to cython.cpython
In-Reply-To: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
References: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
Message-ID: <5131DF66.6030403@behnel.de>

Hi,

the last pull request looks good to me now.

https://github.com/cython/cython/pull/189

Any more comments on it?

Stefan


From szport at gmail.com  Sat Mar  2 18:55:46 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Sat, 2 Mar 2013 20:55:46 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <5131DACD.6050402@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de> <5131DACD.6050402@behnel.de>
Message-ID: <CAPOE21RCGXVeXt8uGL2qamGsAYnqtGxqx1fv67X4BA94_g+4xA@mail.gmail.com>

2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
>> I think you could even pass in two flags, one for wraparound and one for
>> boundscheck, and then just evaluate them appropriately in the existing "if"
>> tests above. That should allow both features to be supported independently
>> in a fast way.
>
> https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94

It seems to include the following directive at the beginning of the
tests (which tests indices for lists, tuples and unicode):

#cython: boundscheck=True
#cython: wraparound=True

as default mode for testing?

-- 
? ?????????,
???????? ?.?.

From stefan_ml at behnel.de  Sat Mar  2 19:47:01 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 02 Mar 2013 19:47:01 +0100
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <CAPOE21RCGXVeXt8uGL2qamGsAYnqtGxqx1fv67X4BA94_g+4xA@mail.gmail.com>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de> <5131DACD.6050402@behnel.de>
	<CAPOE21RCGXVeXt8uGL2qamGsAYnqtGxqx1fv67X4BA94_g+4xA@mail.gmail.com>
Message-ID: <51324925.50901@behnel.de>

Zaur Shibzukhov, 02.03.2013 18:55:
> 2013/3/2 Stefan Behnel:
>>> I think you could even pass in two flags, one for wraparound and one for
>>> boundscheck, and then just evaluate them appropriately in the existing "if"
>>> tests above. That should allow both features to be supported independently
>>> in a fast way.
>>
>> https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94
> 
> It seems to include the following directive at the beginning of the
> tests (which tests indices for lists, tuples and unicode):
> 
> #cython: boundscheck=True
> #cython: wraparound=True
> 
> as default mode for testing?

Yes, although they would appear redundant here.

Stefan


From nikita at nemkin.ru  Sun Mar  3 08:39:50 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sun, 03 Mar 2013 13:39:50 +0600
Subject: [Cython] Py_UNICODE* string support
Message-ID: <op.wtc0wo10yabthb@juga>

Hi,

Please review my feature proposal to add Py_UNICODE* string support
for better Windows interoperability:
https://github.com/cython/cython/pull/191

This is motivated by my current work that involves calling lots of Windows  
APIs.

If people are interested I can elaborate on some important points, like
the choice of base type (Py_UNICODE vs wchar_t) or the nature of
Py_UNICODE* literals or why this feature is necessary at all.


Best regards,
Nikita Nemkin

From robertwb at gmail.com  Sun Mar  3 08:45:53 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Sat, 2 Mar 2013 23:45:53 -0800
Subject: [Cython] Two minor bugs
In-Reply-To: <op.wta316f7yabthb@juga>
References: <op.wta316f7yabthb@juga>
Message-ID: <CADiQ+QDLmSqQdi2W4ZJ+jdnNGCeH+fgn3jY8PQsbO+s76YBhXQ@mail.gmail.com>

On Fri, Mar 1, 2013 at 10:52 PM, Nikita Nemkin <nikita at nemkin.ru> wrote:
> Hi,
>
> I'm new to this list and to Cython internals.
>
> Reporting two recently found bugs:
>
> 1. Explicit <bytes> cast fails unexpectedly:
>
>        ctypedef char* LPSTR
>        cdef LPSTR c_str = b"ascii"
>        <bytes>c_str  # Failure: Python objects cannot be cast from pointers
> of primitive types
>
>    The problem is CTypedefType not delegating can_coerce_to_pyobject() to
> the original type.
>    (because BaseType.can_coerce_to_pyobject takes precedence over
> __getattr__).
>    Patch+test case and attached.

Thanks! Applied.

>    Interestingly, implicit casts use a different code path and are not
> affected.
>
>    There is potential for similar bugs in the future, because __getattr__
>    delegation is inherently brittle in the presence of the base class
> (BaseType).

Yes, very true.

> 2. This recently added code does not compile with MSVC:
>
> https://github.com/cython/cython/blob/master/Cython/Utility/TypeConversion.c#L140-142
>    Interleaving declarations and statements is not allowed in C90...

Fixed https://github.com/cython/cython/commit/24f56e14194e14c706beb6d0ee58a58e77b0b03e

- Robert

From stefan_ml at behnel.de  Sun Mar  3 08:52:49 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 03 Mar 2013 08:52:49 +0100
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <op.wtc0wo10yabthb@juga>
References: <op.wtc0wo10yabthb@juga>
Message-ID: <51330151.5080300@behnel.de>

Nikita Nemkin, 03.03.2013 08:39:
> Please review my feature proposal to add Py_UNICODE* string support
> for better Windows interoperability:
> https://github.com/cython/cython/pull/191
> 
> This is motivated by my current work that involves calling lots of Windows
> APIs.
> 
> If people are interested I can elaborate on some important points, like
> the choice of base type (Py_UNICODE vs wchar_t) or the nature of
> Py_UNICODE* literals or why this feature is necessary at all.

Are you aware that Py_UNICODE is deprecated as of Py3.3?

http://docs.python.org/3.4/c-api/unicode.html

Your changes look a bit excessive for supporting something that's
inefficient in recent Python versions and basically "dead".

Stefan


From nikita at nemkin.ru  Sun Mar  3 09:25:33 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sun, 03 Mar 2013 14:25:33 +0600
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <51330151.5080300@behnel.de>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
Message-ID: <op.wtc20vlsyabthb@juga>

On Sun, 03 Mar 2013 13:52:49 +0600, Stefan Behnel <stefan_ml at behnel.de>  
wrote:

> Are you aware that Py_UNICODE is deprecated as of Py3.3?
>
> http://docs.python.org/3.4/c-api/unicode.html
>
> Your changes look a bit excessive for supporting something that's
> inefficient in recent Python versions and basically "dead".

Yes, I'm well aware of Py3.3 changes, but consider this:

1. _All_ system APIs on Windows, old, new and in-between, use UTF-16 in  
the form of
    zero-terminated 2-byte wchar_t* strings (on Windows Py_UNICODE is  
_always_ aliased
    to wchar_t specifically for this reason).
    Whatever happens to Python internals, the need to interoperate with  
UTF-16 based
    platforms won't go away.

2. PY_UNICODE family of APIs remains the recommended way to interoperate  
with Windows.
    (So said the autor of PEP393 himself, I could find the relevant  
discussion in python-dev.)

3. It is not _that_ inefficient. Actually, it has the same efficiency as  
the UTF8-related APIs
    (which have to be used on UTF-8 platforms like most *nix systems).

    UTF8 allows sharing of ASCII buffer and has to convert USC2/UCS4,
    Py_UNICODE shares UCS2 buffer (assuming narrow build) and has to  
convert ASCII.


One alternative to Py_UNICODE that I have rejected is using Python's  
wchar_t support.
It's practicaly useless for these reasons:
1) wchar_t APIs do not exist in Py2 and have to be implemented for  
compatibility.
2) Implementing them brings in all the pain of nonportable wchar_t type
    (on *nix systems in general), whereas it's the primary users would  
target Windows,
    where (pretty horrible) wchar_t portability workarounds would be dead  
code.
3) wchar_t APIs do not offer a zero-copy option and do not manage the  
memory for us.


The changes are some 50 lines of code, not counting the tests. I wouldn't  
call that excessive.
And they mostly mirror existing code, no trickery of any kind.

Inbuilt Py_UNICODE* support also means that the users would be shielded  
 from 3.3 changes
and Cython is free to optimize sting handling in the future.
Believe me, nobody calls Py_UNICODE APIs because they want to, they just  
have to.


Best regards,
Nikita Nemkin

From stefan_ml at behnel.de  Sun Mar  3 10:32:36 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 03 Mar 2013 10:32:36 +0100
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <op.wtc20vlsyabthb@juga>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga>
Message-ID: <513318B4.3080803@behnel.de>

Nikita Nemkin, 03.03.2013 09:25:
> On Sun, 03 Mar 2013 13:52:49 +0600, Stefan Behnel wrote:
>> Are you aware that Py_UNICODE is deprecated as of Py3.3?
>>
>> http://docs.python.org/3.4/c-api/unicode.html
>>
>> Your changes look a bit excessive for supporting something that's
>> inefficient in recent Python versions and basically "dead".
> 
> Yes, I'm well aware of Py3.3 changes, but consider this:
> 
> 1. _All_ system APIs on Windows, old, new and in-between, use UTF-16 in the
>    form of zero-terminated 2-byte wchar_t* strings (on Windows Py_UNICODE is
>    _always_ aliased to wchar_t specifically for this reason).
>    Whatever happens to Python internals, the need to interoperate with
>    UTF-16 based platforms won't go away.

Ok, fine with me.

Your changes look fairly reasonable, especially for a first try. I have the
following comments.

1) I would like to get rid of UnicodeConst. A Py_UNICODE* is not different
from any other C array, except that it can coerce to and from Unicode
strings. So the representation of a literal should be a (properly reference
counted) Python Unicode object, and users would be allowed to cast them to
<Py_UNICODE*>, just as we support it for <char*> and bytes.

2) non-BMP literals should be supported by representing them as normal
Unicode strings and creating the Py_UNICODE representation at need (i.e.
explicitly through a cast, at runtime). Py_UNICODE[] literals are simply
not portable.

3) __Pyx_Py_UNICODE_strlen() is ok, but only for the special case that all
we have is a Py_UNICODE*. As long as we are dealing with Unicode string
objects, that won't be needed, so len() should be constant time in the
normal case instead of linear time.

4) most of the changes in PyrexTypes.py and ExprNodes.py look ok. I would
eventually like to see a couple of refactorings on these sections (because
the special cases add up over time), but that's not required for this change.

So, the basic idea would be to use Unicode strings and their (optional)
internal representation as Py_UNICODE[] instead of making Py_UNICODE[] a
first class data type. And then go from there and optimise certain things
to use the unpacked array directly, so that users won't need to put
explicit C-API calls into their code.

Stefan


From szport at gmail.com  Sun Mar  3 10:49:11 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Sun, 3 Mar 2013 12:49:11 +0300
Subject: [Cython] About IndexNode and unicode[index]
In-Reply-To: <5131DACD.6050402@behnel.de>
References: <CAPOE21Ti03-WWad2jEbHMr5029NqrP6zrj-xrZaiigMq8UT+uw@mail.gmail.com>
	<CAPOE21Sa7fvs2JjpO+CORNnWFM=Fiqb3M07nUNdwxaL9wD=dCg@mail.gmail.com>
	<512FAF8C.7020008@behnel.de>
	<CAPOE21SMRYFYnR8AmHyrrgFabHX0RG0T=jn-29QZix0kfBxGOw@mail.gmail.com>
	<512FC919.4010702@behnel.de> <5131DACD.6050402@behnel.de>
Message-ID: <CAPOE21TFUCr+RmHDfdq0P8GUvu2uG6YVs6z2RFv4M_5pQrMeWw@mail.gmail.com>

2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
> Stefan Behnel, 28.02.2013 22:16:
>
> https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94
>
> Stefan
>
I tried to build with that change. Tests `unicode_indexing` and
`index` are passed.

From nikita at nemkin.ru  Sun Mar  3 14:40:54 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sun, 03 Mar 2013 19:40:54 +0600
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <513318B4.3080803@behnel.de>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga> <513318B4.3080803@behnel.de>
Message-ID: <op.wtdhmgbpyabthb@juga>

On Sun, 03 Mar 2013 15:32:36 +0600, Stefan Behnel <stefan_ml at behnel.de>  
wrote:

> 1) I would like to get rid of UnicodeConst. A Py_UNICODE* is not  
> different
> from any other C array, except that it can coerce to and from Unicode
> strings. So the representation of a literal should be a (properly
> reference
> counted) Python Unicode object, and users would be allowed to cast them
> to <Py_UNICODE*>, just as we support it for <char*> and bytes.

I understand the idea. Since Python unicode literals are implicitly
coercible to Py_UNICODE*, there appears to be no need for C-level
Py_UNICODE[] literals. Indeed, client code will look exactly (!) the same
whether they are supported or not.

Except when it comes to nogil. (For example, native callbacks are almost
guaranteed to be nogil.) Hiding Python operations in what appears to be
pure C-level code will break users' assumptions.
This is #1 reason why I went for C-level literals. #2 reason is efficiency
on Py3.3. C-level literals don't need conversions and don't call any  
conversion APIs.

> 2) non-BMP literals should be supported by representing them as normal
> Unicode strings and creating the Py_UNICODE representation at need (i.e.
> explicitly through a cast, at runtime). Py_UNICODE[] literals are simply
> not portable.

Py_UNICODE[] literals can be made fully portable if non-BMP ones are  
wrapped
like this:

    #ifdef Py_UNICODE_WIDE
    static const k_xxx[] = { <UTF-32 array without surrogates>, 0 };
    #else
    static const k_xxx[] = { <UTF-16 array with surrogates>, 0 };
    #endif

Literals containing only BMP chars are already portable and don't need
this wrapping.

> 3) __Pyx_Py_UNICODE_strlen() is ok, but only for the special case that  
> all we have is a Py_UNICODE*. As long as we are dealing with Unicode  
> string
> objects, that won't be needed, so len() should be constant time in the
> normal case instead of linear time.

len(Py_UNICODE*) simply mirrors len(char*). Its putpose is to provide
platform-independent Py_UNICODE_strlen (which is Py3 only and deprecated  
in 3.3).

> So, the basic idea would be to use Unicode strings and their (optional)
> internal representation as Py_UNICODE[] instead of making Py_UNICODE[] a
> first class data type. And then go from there and optimise certain things
> to use the unpacked array directly, so that users won't need to put
> explicit C-API calls into their code.

Please reconsider your decision wrt C-level literals.
I believe that nogil code and a bit of efficiency (on 3.3) justify their
existence. (char* literals do have C-level literals, Py_UNICODE* is in
the same basket when it comes to Windows code).
The code to support them is also small and well-contained.
I've updated my pull request to fully support for non-BMP Py_UNICODE[]  
literals.

If you are still not convinced, so be it, I'll drop C-level literal  
support.


Best regards,
Nikita Nemkin


PS. I made a false claim in the previous mail. (Some of) Python's wchar_t  
APIs
do exist in Py2. But they won't manage the memory automatically anyway.

From szport at gmail.com  Sun Mar  3 15:52:10 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Sun, 3 Mar 2013 17:52:10 +0300
Subject: [Cython] To Add datetime.pxd to cython.cpython
In-Reply-To: <5131DF66.6030403@behnel.de>
References: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
	<5131DF66.6030403@behnel.de>
Message-ID: <CAPOE21RJYF6kVw26RHXk5-QzBCcvjmaXq8twac+Aqe4Gx_xjCg@mail.gmail.com>

2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
> Hi,
>
> the last pull request looks good to me now.
>
> https://github.com/cython/cython/pull/189
>
> Any more comments on it?
>
> Stefan
>

As was suggested earlier, I added `import_datetime` inline function to
initialize PyDateTime C API instead of direct usage of "non-native" C
macros from datetime.h.
Now you call `import_array ()` first in the same way as is done with `numpy`.
 This approach looks natural in the light of experience with numpy.


Zaur Shibzukhov

From szport at gmail.com  Sun Mar  3 20:11:42 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Sun, 3 Mar 2013 22:11:42 +0300
Subject: [Cython] To Add datetime.pxd to cython.cpython
In-Reply-To: <CAPOE21RJYF6kVw26RHXk5-QzBCcvjmaXq8twac+Aqe4Gx_xjCg@mail.gmail.com>
References: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
	<5131DF66.6030403@behnel.de>
	<CAPOE21RJYF6kVw26RHXk5-QzBCcvjmaXq8twac+Aqe4Gx_xjCg@mail.gmail.com>
Message-ID: <CAPOE21Sw-7reRg+e53Mq6Ad5JnL12mdX+982yJZBAK4tyHfFXg@mail.gmail.com>

2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
> 2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
>> Hi,
>>
>> the last pull request looks good to me now.
>>
>> https://github.com/cython/cython/pull/189
>>
>> Any more comments on it?
>
> As was suggested earlier, I added `import_datetime` inline function to
> initialize PyDateTime C API instead of direct usage of "non-native" C
> macros from datetime.h.
> Now you call `import_array ()` first in the same way as is done with `numpy`.
>  This approach looks natural in the light of experience with numpy.
>
 I make some performance comparisons. Here example for dates.

# test_date.pyx
--------------------

Here test code:

from cpython.datetime cimport import_datetime, date_new, date

import_datetime()

from datetime import date as pydate

def test_date1():
    cdef list lst = []
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = pydate(year, month, day)
                lst.append(d)
    return lst


def test_date2():
    cdef list lst = []
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date(year, month, day)
                lst.append(d)
    return lst

def test_date3():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date_new(year, month, day)
                lst.append(d)
    return lst

def test1():
    l = test_date1()
    return l

def test2():
    l = test_date2()
    return l

def test3():
    l = test_date3()
    return l

Here are timings:

(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test1" "test1()"
50 loops, best of 5: 83.2 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test2" "test2()"
50 loops, best of 5: 74.7 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test3" "test3()"
50 loops, best of 5: 20.9 msec per loop

OSX 10.6.8 64 bit python 3.2

Shibzukhov Zaur

From szport at gmail.com  Sun Mar  3 20:16:43 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Sun, 3 Mar 2013 22:16:43 +0300
Subject: [Cython] To Add datetime.pxd to cython.cpython
In-Reply-To: <CAPOE21Sw-7reRg+e53Mq6Ad5JnL12mdX+982yJZBAK4tyHfFXg@mail.gmail.com>
References: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
	<5131DF66.6030403@behnel.de>
	<CAPOE21RJYF6kVw26RHXk5-QzBCcvjmaXq8twac+Aqe4Gx_xjCg@mail.gmail.com>
	<CAPOE21Sw-7reRg+e53Mq6Ad5JnL12mdX+982yJZBAK4tyHfFXg@mail.gmail.com>
Message-ID: <CAPOE21QSG_bnHMTQhLBe6Mfy4WMEVw6Og-JzxnC8URmEn_gWKw@mail.gmail.com>

2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
> 2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
>> 2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
>>> Hi,
>>>
>>> the last pull request looks good to me now.
>>>
>>> https://github.com/cython/cython/pull/189
>>>
>>> Any more comments on it?
>>
>> As was suggested earlier, I added `import_datetime` inline function to
>> initialize PyDateTime C API instead of direct usage of "non-native" C
>> macros from datetime.h.
>> Now you call `import_array ()` first in the same way as is done with `numpy`.
>>  This approach looks natural in the light of experience with numpy.
>>
>  I make some performance comparisons. Here example for dates.
>
> # test_date.pyx
> --------------------
>
> Here test code:
>
> from cpython.datetime cimport import_datetime, date_new, date
>
> import_datetime()
>
> from datetime import date as pydate
>
> def test_date1():
>     cdef list lst = []
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = pydate(year, month, day)
>                 lst.append(d)
>     return lst
>
>
> def test_date2():
>     cdef list lst = []
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = date(year, month, day)
>                 lst.append(d)
>     return lst
>
> def test_date3():
>     cdef list lst = []
>     cdef int year, month, day
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = date_new(year, month, day)
>                 lst.append(d)
>     return lst
>
> def test1():
>     l = test_date1()
>     return l
>
> def test2():
>     l = test_date2()
>     return l
>
> def test3():
>     l = test_date3()
>     return l
>
> Here are timings:
>
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test1" "test1()"
> 50 loops, best of 5: 83.2 msec per loop
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test2" "test2()"
> 50 loops, best of 5: 74.7 msec per loop
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test3" "test3()"
> 50 loops, best of 5: 20.9 msec per loop
>
> OSX 10.6.8 64 bit python 3.2
>

More acurate test...

# coding: utf-8

from cpython.datetime cimport import_datetime, date_new, date

import_datetime()

from datetime import date as pydate

def test_date1():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = pydate(year, month, day)
                lst.append(d)
    return lst


def test_date2():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date(year, month, day)
                lst.append(d)
    return lst

def test_date3():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date_new(year, month, day)
                lst.append(d)
    return lst

def test1():
    l = test_date1()
    return l

def test2():
    l = test_date2()
    return l

def test3():
    l = test_date3()
    return l

Timings:

(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test1" "test1()"
50 loops, best of 5: 83.3 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test2" "test2()"
50 loops, best of 5: 74.6 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_date import test3" "test3()"
50 loops, best of 5: 20.8 msec per loop

Shibzukhov Zaur

From stefan_ml at behnel.de  Sun Mar  3 20:41:04 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 03 Mar 2013 20:41:04 +0100
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <op.wtdhmgbpyabthb@juga>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga> <513318B4.3080803@behnel.de>
	<op.wtdhmgbpyabthb@juga>
Message-ID: <5133A750.5000203@behnel.de>

Nikita Nemkin, 03.03.2013 14:40:
> Please reconsider your decision wrt C-level literals.
> I believe that nogil code and a bit of efficiency (on 3.3) justify their
> existence. (char* literals do have C-level literals, Py_UNICODE* is in
> the same basket when it comes to Windows code).
> The code to support them is also small and well-contained.
> I've updated my pull request to fully support for non-BMP Py_UNICODE[]
> literals.

Ok, I think it's ok now. I can accept the special casing of Py_UNICODE
literals, it actually adds a value.

As one little nit-pick, may I ask you to rename the new name references to
"unicode" into "py_unicode" in your code? For example, "is_unicode",
"get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is
no longer the native equivalent of Python's unicode type in Py3.3, I'd like
to avoid confusion in the code. The name "unicode" is much more likely to
refer to the builtin Python type than to a native C type when it appears in
Cython's sources.

Stefan


From stefan_ml at behnel.de  Sun Mar  3 20:56:59 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 03 Mar 2013 20:56:59 +0100
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <5133A750.5000203@behnel.de>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga> <513318B4.3080803@behnel.de>
	<op.wtdhmgbpyabthb@juga> <5133A750.5000203@behnel.de>
Message-ID: <5133AB0B.4060208@behnel.de>

Stefan Behnel, 03.03.2013 20:41:
> Nikita Nemkin, 03.03.2013 14:40:
>> Please reconsider your decision wrt C-level literals.
>> I believe that nogil code and a bit of efficiency (on 3.3) justify their
>> existence. (char* literals do have C-level literals, Py_UNICODE* is in
>> the same basket when it comes to Windows code).
>> The code to support them is also small and well-contained.
>> I've updated my pull request to fully support for non-BMP Py_UNICODE[]
>> literals.
> 
> Ok, I think it's ok now. I can accept the special casing of Py_UNICODE
> literals, it actually adds a value.
> 
> As one little nit-pick, may I ask you to rename the new name references to
> "unicode" into "py_unicode" in your code? For example, "is_unicode",
> "get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is
> no longer the native equivalent of Python's unicode type in Py3.3, I'd like
> to avoid confusion in the code. The name "unicode" is much more likely to
> refer to the builtin Python type than to a native C type when it appears in
> Cython's sources.

Oh, and yet another thing: could you write up some documentation for this
in docs/src/tutorial/strings.rst ? Basically a Windows/wchar_t related
section, that also warns about the inefficiency in Py3.3, so that users
don't accidentally assume it's efficient for anything that needs to be
portable.

Stefan


From szport at gmail.com  Mon Mar  4 07:24:30 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Mon, 4 Mar 2013 09:24:30 +0300
Subject: [Cython] To Add datetime.pxd to cython.cpython
In-Reply-To: <CAPOE21QSG_bnHMTQhLBe6Mfy4WMEVw6Og-JzxnC8URmEn_gWKw@mail.gmail.com>
References: <CAPOE21SkWaQMYhA_b9zuzC--MugcdvrBZidQXHK1+56xf4GUgQ@mail.gmail.com>
	<5131DF66.6030403@behnel.de>
	<CAPOE21RJYF6kVw26RHXk5-QzBCcvjmaXq8twac+Aqe4Gx_xjCg@mail.gmail.com>
	<CAPOE21Sw-7reRg+e53Mq6Ad5JnL12mdX+982yJZBAK4tyHfFXg@mail.gmail.com>
	<CAPOE21QSG_bnHMTQhLBe6Mfy4WMEVw6Og-JzxnC8URmEn_gWKw@mail.gmail.com>
Message-ID: <CAPOE21SKf4nJzP_DFBPxnw_tBtfje_ncsA-4dJDSRZReQdzE0Q@mail.gmail.com>

2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
> 2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
>> 2013/3/3 Zaur Shibzukhov <szport at gmail.com>:
>>> 2013/3/2 Stefan Behnel <stefan_ml at behnel.de>:
>>>> Hi,
>>>>
>>>> the last pull request looks good to me now.
>>>>
>>>> https://github.com/cython/cython/pull/189
>>>>
>>>> Any more comments on it?
>>>
>>> As was suggested earlier, I added `import_datetime` inline function to
>>> initialize PyDateTime C API instead of direct usage of "non-native" C
>>> macros from datetime.h.
>>> Now you call `import_array ()` first in the same way as is done with `numpy`.
>>>  This approach looks natural in the light of experience with numpy.
>>>
>>  I make some performance comparisons. Here example for dates.
>>
>> # test_date.pyx
>> --------------------
>>
>> Here test code:
>>
>> from cpython.datetime cimport import_datetime, date_new, date
>>
>> import_datetime()
>>
>> from datetime import date as pydate
>>
>> def test_date1():
>>     cdef list lst = []
>>     for year in range(1000, 2001):
>>         for month in range(1,13):
>>             for day in range(1, 20):
>>                 d = pydate(year, month, day)
>>                 lst.append(d)
>>     return lst
>>
>>
>> def test_date2():
>>     cdef list lst = []
>>     for year in range(1000, 2001):
>>         for month in range(1,13):
>>             for day in range(1, 20):
>>                 d = date(year, month, day)
>>                 lst.append(d)
>>     return lst
>>
>> def test_date3():
>>     cdef list lst = []
>>     cdef int year, month, day
>>     for year in range(1000, 2001):
>>         for month in range(1,13):
>>             for day in range(1, 20):
>>                 d = date_new(year, month, day)
>>                 lst.append(d)
>>     return lst
>>
>> def test1():
>>     l = test_date1()
>>     return l
>>
>> def test2():
>>     l = test_date2()
>>     return l
>>
>> def test3():
>>     l = test_date3()
>>     return l
>>
>> Here are timings:
>>
>> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
>> mytests.test_date import test1" "test1()"
>> 50 loops, best of 5: 83.2 msec per loop
>> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
>> mytests.test_date import test2" "test2()"
>> 50 loops, best of 5: 74.7 msec per loop
>> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
>> mytests.test_date import test3" "test3()"
>> 50 loops, best of 5: 20.9 msec per loop
>>
>> OSX 10.6.8 64 bit python 3.2
>>
>
> More acurate test...
>
> # coding: utf-8
>
> from cpython.datetime cimport import_datetime, date_new, date
>
> import_datetime()
>
> from datetime import date as pydate
>
> def test_date1():
>     cdef list lst = []
>     cdef int year, month, day
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = pydate(year, month, day)
>                 lst.append(d)
>     return lst
>
>
> def test_date2():
>     cdef list lst = []
>     cdef int year, month, day
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = date(year, month, day)
>                 lst.append(d)
>     return lst
>
> def test_date3():
>     cdef list lst = []
>     cdef int year, month, day
>     for year in range(1000, 2001):
>         for month in range(1,13):
>             for day in range(1, 20):
>                 d = date_new(year, month, day)
>                 lst.append(d)
>     return lst
>
> def test1():
>     l = test_date1()
>     return l
>
> def test2():
>     l = test_date2()
>     return l
>
> def test3():
>     l = test_date3()
>     return l
>
> Timings:
>
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test1" "test1()"
> 50 loops, best of 5: 83.3 msec per loop
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test2" "test2()"
> 50 loops, best of 5: 74.6 msec per loop
> (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
> mytests.test_date import test3" "test3()"
> 50 loops, best of 5: 20.8 msec per loop

Yet another performance comparison for `time`:

# coding: utf-8

from cpython.datetime cimport import_datetime, time_new, time

import_datetime()

from datetime import time as pytime

def test_time1():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0, 100000, 50000):
                    d = pytime(hour, minute, second, microsecond)
                    lst.append(d)
    return lst


def test_time2():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0, 100000, 50000):
                    d = time(hour, minute, second, microsecond)
                    lst.append(d)
    return lst

def test_time3():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0, 100000, 50000):
                    d = time_new(hour, minute, second, microsecond, None)
                    lst.append(d)
    return lst

def test1():
    l = test_time1()
    return l

def test2():
    l = test_time2()
    return l

def test3():
    l = test_time3()
    return l

Timings:

(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_time import test1" "test1()"
50 loops, best of 5: 72.2 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_time import test2" "test2()"
50 loops, best of 5: 64.7 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from
mytests.test_time import test3" "test3()"
50 loops, best of 5: 13 msec per loop

Sure the same kind of results might expect for `datetime` too.

Shibzukhov Zaur

From sturla at molden.no  Mon Mar  4 11:32:02 2013
From: sturla at molden.no (Sturla Molden)
Date: Mon, 4 Mar 2013 11:32:02 +0100
Subject: [Cython] PR on refcounting memoryview buffers
In-Reply-To: <15C80BD0-302E-4576-ACF3-C0FFD700569B@molden.no>
References: <512273C8.4000005@molden.no>
	<15C80BD0-302E-4576-ACF3-C0FFD700569B@molden.no>
Message-ID: <FD4F6861-199F-4BF2-92DC-44882C908C6E@molden.no>

Den 20. feb. 2013 kl. 11:55 skrev Sturla Molden <sturla at molden.no>:

> 
> Den 18. feb. 2013 kl. 19:32 skrev Sturla Molden <sturla at molden.no>:
> 
>> The problem this addresses is when GCC does not use atomic builtins and emits __synch_fetch_and_add_4 and __synch_fetch_and_sub_4 when Cython are internally refcounting memoryview buffers. For some reason it can even happen on x86 and amd64.
> 
> Specifically, atomic builtins are not used when compiling for i386, which is MinGWs default target architecture (unless we specify a different -march). GCC will always encounter this problem when targeting i386.
> 
> Thus the correct fix is to use fallback when GCC is targeting i386 ? not when GCC is targeting MS Windows. 
> 
> So I am closing this PR. But Mark's fix must be corrected, because it does not really address the problem (which is i386, not MinGW)! 
> 


Please consider this pull-request:

https://github.com/cython/cython/pull/190


Sturla Molden


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130304/e1a2fd91/attachment.html>

From nikita at nemkin.ru  Mon Mar  4 18:39:36 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Mon, 04 Mar 2013 23:39:36 +0600
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <5133AB0B.4060208@behnel.de>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga> <513318B4.3080803@behnel.de>
	<op.wtdhmgbpyabthb@juga>
	<5133A750.5000203@behnel.de> <5133AB0B.4060208@behnel.de>
Message-ID: <op.wtfnca10yabthb@juga>

On Mon, 04 Mar 2013 01:56:59 +0600, Stefan Behnel <stefan_ml at behnel.de>  
wrote:

> As one little nit-pick, may I ask you to rename the new name references  
> to "unicode" into "py_unicode" in your code? For example, "is_unicode",
> "get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is
> no longer the native equivalent of Python's unicode type in Py3.3, I'd  
> like to avoid confusion in the code. The name "unicode" is much more  
> likely to
> refer to the builtin Python type than to a native C type when it appears  
> in Cython's sources.

Actually, "py_unicode" is even more likely to be mistaken for Python-level
unicode. There are already pairs of methods like
get_string_const (C-level) + get_py_string_const (Py-level).

I suggest one of "py_unicode_ptr", "py_unicode_str", "wstring",  
"wide_string",
"ustring", "unicode_string" to unambiguously refer to Py_UNICODE* variables
and constants. Take yout pick.


> Oh, and yet another thing: could you write up some documentation for this
> in docs/src/tutorial/strings.rst ? Basically a Windows/wchar_t related
> section, that also warns about the inefficiency in Py3.3, so that users
> don't accidentally assume it's efficient for anything that needs to be
> portable.

Sure, I'm writing the docs now.


Best regards,
Nikita Nemkin

From stefan_ml at behnel.de  Mon Mar  4 18:58:34 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 04 Mar 2013 18:58:34 +0100
Subject: [Cython] Py_UNICODE* string support
In-Reply-To: <op.wtfnca10yabthb@juga>
References: <op.wtc0wo10yabthb@juga> <51330151.5080300@behnel.de>
	<op.wtc20vlsyabthb@juga> <513318B4.3080803@behnel.de>
	<op.wtdhmgbpyabthb@juga> <5133A750.5000203@behnel.de>
	<5133AB0B.4060208@behnel.de> <op.wtfnca10yabthb@juga>
Message-ID: <5134E0CA.2060106@behnel.de>

Nikita Nemkin, 04.03.2013 18:39:
> On Mon, 04 Mar 2013 01:56:59 +0600, Stefan Behnel wrote:
>> As one little nit-pick, may I ask you to rename the new name references
>> to "unicode" into "py_unicode" in your code? For example, "is_unicode",
>> "get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is
>> no longer the native equivalent of Python's unicode type in Py3.3, I'd
>> like to avoid confusion in the code. The name "unicode" is much more
>> likely to
>> refer to the builtin Python type than to a native C type when it appears
>> in Cython's sources.
> 
> Actually, "py_unicode" is even more likely to be mistaken for Python-level
> unicode. There are already pairs of methods like
> get_string_const (C-level) + get_py_string_const (Py-level).

Agreed.


> I suggest one of "py_unicode_ptr", "py_unicode_str", "wstring", "wide_string",
> "ustring", "unicode_string" to unambiguously refer to Py_UNICODE* variables
> and constants. Take yout pick.

I think "pyunicode_ptr" or even just "pyunicode" makes it quite clear what
it's about and specifically that "pyunicode" is actually a type name, not a
"py_something". Even "pyunicode_array" would work, although it might
suggest that we know more at compile time than we do, such as the length.

I'll let you choose between these three, although I'm leaning slightly
towards an order of preference as they appear above.


>> Oh, and yet another thing: could you write up some documentation for this
>> in docs/src/tutorial/strings.rst ? Basically a Windows/wchar_t related
>> section, that also warns about the inefficiency in Py3.3, so that users
>> don't accidentally assume it's efficient for anything that needs to be
>> portable.
> 
> Sure, I'm writing the docs now.

Nice.

Stefan


From szport at gmail.com  Tue Mar  5 07:21:21 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Tue, 5 Mar 2013 09:21:21 +0300
Subject: [Cython] nonecheck and as_none_safe_node method
Message-ID: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>

In ExprNodes.py there are several places where method `as_none_safe_node`
was applied in order to wrap a node by NoneCheckNode.
I think it would be resonable to apply that mostly only in cases when
noncheck=True.

Here are possible changes in ExprNodes.py:
https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c

Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130305/70a4766f/attachment.html>

From szport at gmail.com  Tue Mar  5 07:24:42 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Tue, 5 Mar 2013 09:24:42 +0300
Subject: [Cython] nonecheck and as_none_safe_node method
In-Reply-To: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
References: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
Message-ID: <CAPOE21QvJegzXW69k4W4rKf4TecCA=uVYroa7r9kPVfFMFMv+w@mail.gmail.com>

2013/3/5 Zaur Shibzukhov <szport at gmail.com>

> In ExprNodes.py there are several places where method `as_none_safe_node`
> was applied in order to wrap a node by NoneCheckNode.
> I think it would be resonable to apply that mostly only in cases when
> noncheck=True.
>
> Here are possible changes in ExprNodes.py:
>
> https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c
>
> This change would prevent generation of None checking of an objects
(lists, tuples, unicode) when nonecheck=True.

Any adeas?


Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130305/d9c014ea/attachment.html>

From szport at gmail.com  Tue Mar  5 07:26:52 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Tue, 5 Mar 2013 09:26:52 +0300
Subject: [Cython] nonecheck and as_none_safe_node method
In-Reply-To: <CAPOE21QvJegzXW69k4W4rKf4TecCA=uVYroa7r9kPVfFMFMv+w@mail.gmail.com>
References: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
	<CAPOE21QvJegzXW69k4W4rKf4TecCA=uVYroa7r9kPVfFMFMv+w@mail.gmail.com>
Message-ID: <CAPOE21TK4GxskRaHoQEK28ZrXsh3027T6-yMapiU4S+6QSseGw@mail.gmail.com>

2013/3/5 Zaur Shibzukhov <szport at gmail.com>

> 2013/3/5 Zaur Shibzukhov <szport at gmail.com>
>
>> In ExprNodes.py there are several places where method `as_none_safe_node`
>> was applied in order to wrap a node by NoneCheckNode.
>> I think it would be resonable to apply that mostly only in cases when
>> noncheck=True.
>>
>> Here are possible changes in ExprNodes.py:
>>
>> https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c
>>
>> This change would prevent generation of None checking of an objects
> (lists, tuples, unicode) when nonecheck=True.
>

Sorry... when  nonecheck=False


> Any adeas?
>
>


 Zaur Shibzukhov


-- 
? ?????????,
???????? ?.?.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130305/399e5f88/attachment.html>

From stefan_ml at behnel.de  Tue Mar  5 08:12:16 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 05 Mar 2013 08:12:16 +0100
Subject: [Cython] nonecheck and as_none_safe_node method
In-Reply-To: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
References: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
Message-ID: <51359AD0.50209@behnel.de>

Zaur Shibzukhov, 05.03.2013 07:21:
> In ExprNodes.py there are several places where method `as_none_safe_node`
> was applied in order to wrap a node by NoneCheckNode.
> I think it would be resonable to apply that mostly only in cases when
> noncheck=True.
> 
> Here are possible changes in ExprNodes.py:
> https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c

I consider the nonecheck option a quirk. In many, many cases, it's not
obvious to a user to what constructs it applies. For example, we use it to
guard against crashes when we optimise code, e.g. by inlining parts of a
C-API function, when iterating over builtins, etc. In most of these cases,
it depends on more than one parameter if the optimised code will be applied
(and thus no None check) or the fallback, which usually does its own
complete set of safety checks. So it's one of those options that may work
safely in all unit tests and then crash in production.

Remember, most cases where we leave a None check in the code are not those
where it's obvious that a variable cannot be None because it was just
assigned a non-None value. Most cases are about function arguments, i.e.
values that come from outside of the current function, and thus are not
"obviously" correct even for the human reader or author of the code.

Also, I'm yet to see a case where a None check really makes a difference in
performance. Often enough, the C compiler will be able to move them out of
loops or drop them completely because it already saw a None check against
the same local variable earlier on. In those cases, it's just Cython not
being smart enough to drop them itself, but without an impact on runtime
performance. And even if the C compiler is not smart enough either, at
least the branch prediction of the processor will strike in the relevant
cases (i.e. inside of loops) and reduce the overhead to "pretty much zero".

All of this makes em think that we should be very careful when we consider
this option for the generated code.

Stefan


From szport at gmail.com  Tue Mar  5 08:30:26 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Tue, 5 Mar 2013 10:30:26 +0300
Subject: [Cython] nonecheck and as_none_safe_node method
In-Reply-To: <51359AD0.50209@behnel.de>
References: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
	<51359AD0.50209@behnel.de>
Message-ID: <CAPOE21S76=nOtJ6ny43O0t0HjMtpJ3n3GvH_nDB7eXZXQPsNzw@mail.gmail.com>

2013/3/5 Stefan Behnel <stefan_ml at behnel.de>

> Zaur Shibzukhov, 05.03.2013 07:21:
> > In ExprNodes.py there are several places where method `as_none_safe_node`
> > was applied in order to wrap a node by NoneCheckNode.
> > I think it would be resonable to apply that mostly only in cases when
> > noncheck=True.
> >
> > Here are possible changes in ExprNodes.py:
> >
> https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c
>
> I consider the nonecheck option a quirk. In many, many cases, it's not
> obvious to a user to what constructs it applies. For example, we use it to
> guard against crashes when we optimise code, e.g. by inlining parts of a
> C-API function, when iterating over builtins, etc. In most of these cases,
> it depends on more than one parameter if the optimised code will be applied
> (and thus no None check) or the fallback, which usually does its own
> complete set of safety checks. So it's one of those options that may work
> safely in all unit tests and then crash in production.
>
> Remember, most cases where we leave a None check in the code are not those
> where it's obvious that a variable cannot be None because it was just
> assigned a non-None value. Most cases are about function arguments, i.e.
> values that come from outside of the current function, and thus are not
> "obviously" correct even for the human reader or author of the code.
>
> Also, I'm yet to see a case where a None check really makes a difference in
> performance. Often enough, the C compiler will be able to move them out of
> loops or drop them completely because it already saw a None check against
> the same local variable earlier on. In those cases, it's just Cython not
> being smart enough to drop them itself, but without an impact on runtime
> performance. And even if the C compiler is not smart enough either, at
> least the branch prediction of the processor will strike in the relevant
> cases (i.e. inside of loops) and reduce the overhead to "pretty much zero".
>
> All of this makes em think that we should be very careful when we consider
> this option for the generated code.
>
>
 I agree that directive nonecheck=False is dangarous in general.

This change mainly affect builtin objects (lists, tuples, dicts, unicode)
and some situation in function/method calls.
And it assume that when you apply this directive you know what you are
doing and why.

Note that Cython now already set nonecheck=False (but boundcheck and
wraparound sets to True in Options.py) by default.
But now it not affect builtin types and some special cases.

May be the safer strategy is to set nonecheck=True by default and allow
locally to apply nonecheck(False) when developer believes that it is
necessary?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130305/8d0d76e8/attachment.html>

From szport at gmail.com  Tue Mar  5 11:24:33 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Tue, 5 Mar 2013 13:24:33 +0300
Subject: [Cython] nonecheck and as_none_safe_node method
In-Reply-To: <CAPOE21S76=nOtJ6ny43O0t0HjMtpJ3n3GvH_nDB7eXZXQPsNzw@mail.gmail.com>
References: <CAPOE21RZk_7datq6nXqNp_KW_3HjyNYCCr7aqPNkxonkVhWu=w@mail.gmail.com>
	<51359AD0.50209@behnel.de>
	<CAPOE21S76=nOtJ6ny43O0t0HjMtpJ3n3GvH_nDB7eXZXQPsNzw@mail.gmail.com>
Message-ID: <CAPOE21TVjmfwVLJO3iCvB2o-pMKdE_TRpUrZGCQZF=PNTsCSyw@mail.gmail.com>

2013/3/5 Zaur Shibzukhov <szport at gmail.com>

>
> May be the safer strategy is to set nonecheck=True by default and allow
> locally to apply nonecheck(False) when developer believes that it is
> necessary?
>

Strategy of making nonecheck=True by default and setting explicitly
nonecheck=False when it's necessary is more manageable IMHO.
In Cython sources one can explicitly (where it's necessary) ignore this
default setting, or set it explicitly to False in concrete
context/environment.

Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130305/90dce51e/attachment.html>

From yury at shurup.com  Thu Mar  7 12:16:10 2013
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 07 Mar 2013 12:16:10 +0100
Subject: [Cython] Cython syntax to pre-allocate lists for performance
Message-ID: <1362654970.2849.9.camel@newpride>

Hi,

Is there any syntax that I can use to do something like this in Cython:

    py_object_ = PyList_New(123); ?

If not, do you think that this can be added in one way or another?

Unfortunately, I can't think of a non-disruptive way of doing it. For
instance, if this

    [None] * N

is given a completely new meaning, like make an empty list (of NULLs),
instead of making a real list of Nones, it will certainly break Python
code. Besides, it would probably be still faster than no pre-allocation,
but slower than an empty list with pre-allocation...

Maybe

    [NULL] * N ?

Any ideas?

-- 
Sincerely yours,
Yury V. Zaytsev


From stefan_ml at behnel.de  Thu Mar  7 12:21:39 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 07 Mar 2013 12:21:39 +0100
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <1362654970.2849.9.camel@newpride>
References: <1362654970.2849.9.camel@newpride>
Message-ID: <51387843.602@behnel.de>

Yury V. Zaytsev, 07.03.2013 12:16:
> Is there any syntax that I can use to do something like this in Cython:
> 
>     py_object_ = PyList_New(123); ?

Note that Python has an algorithm for shrinking a list on appending, so
this might not be sufficient for your use case.


> If not, do you think that this can be added in one way or another?
> 
> Unfortunately, I can't think of a non-disruptive way of doing it. For
> instance, if this
> 
>     [None] * N
> 
> is given a completely new meaning, like make an empty list (of NULLs),
> instead of making a real list of Nones, it will certainly break Python
> code. Besides, it would probably be still faster than no pre-allocation,
> but slower than an empty list with pre-allocation...
> 
> Maybe
> 
>     [NULL] * N ?

What do you need it for?

Won't list comprehensions work for you? They could potentially be adapted
to presize the list.

And why won't [None]*N help you out? It should be pretty cheap.

Stefan


From nikita at nemkin.ru  Thu Mar  7 12:59:22 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Thu, 07 Mar 2013 17:59:22 +0600
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <51387843.602@behnel.de>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
Message-ID: <op.wtkrk8ghyabthb@juga>

On Thu, 07 Mar 2013 17:16:10 +0600, Yury V. Zaytsev <yury at shurup.com>  
wrote:

> Hi,
>
> Is there any syntax that I can use to do something like this in Cython:
>
>     py_object_ = PyList_New(123); ?
>
> If not, do you think that this can be added in one way or another?
>
> Unfortunately, I can't think of a non-disruptive way of doing it. For
> instance, if this
>
>     [None] * N
>
> is given a completely new meaning, like make an empty list (of NULLs),
> instead of making a real list of Nones, it will certainly break Python
> code. Besides, it would probably be still faster than no pre-allocation,
> but slower than an empty list with pre-allocation...
>
> Maybe
>
>     [NULL] * N ?
>
> Any ideas?
>

I really like the "[NULL] * N" thing.

Efficient empty list allocation and filling is something I stumble upon
quite often, especially in binding code.

I doubt Cython will be able to automatically use PyList_SET_ITEM
for assignment to such NULL list (it would require induction variable
analysis), but eliminating one extra pass over the list is already helpful.

Implementation note (if this gets implemented):
Cython's optimized list assignment routine uses Py_DECREF, this will
have to be changed to Py_XDECREF, otherwise NULL list items won't be
directly assignable from Cython.
(PyList_SetItem always uses Py_XDECREF on the old element).


> What do you need it for?
>
> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.

List comprehensions do not preallocate the list.
If they did, the need for the above would be somewhat diminished.

> And why won't [None]*N help you out? It should be pretty cheap.

[None] * N makes

From nikita at nemkin.ru  Thu Mar  7 13:02:17 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Thu, 07 Mar 2013 18:02:17 +0600
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <51387843.602@behnel.de>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
Message-ID: <op.wtkrp3vgyabthb@juga>

Sorry, accidental early send. Previous mail continued...

[None] * N makes an extra pass over the list to assign None to each item
(and also incref None n times).
This is useless extra work. The larget the list, the worse it gets.


Best regards,
Nikita Nemkin

From szport at gmail.com  Thu Mar  7 14:39:33 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Thu, 7 Mar 2013 16:39:33 +0300
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <51387843.602@behnel.de>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
Message-ID: <CAPOE21SRppQUNoVCWjm0hj+tcHaw=KH9BAd3FpgLVPsAaggqrg@mail.gmail.com>

2013/3/7 Stefan Behnel <stefan_ml at behnel.de>

> Yury V. Zaytsev, 07.03.2013 12:16:
> > Is there any syntax that I can use to do something like this in Cython:
> >
> >     py_object_ = PyList_New(123); ?
>
> Note that Python has an algorithm for shrinking a list on appending, so
> this might not be sufficient for your use case.
>
>
> > If not, do you think that this can be added in one way or another?
> >
> > Unfortunately, I can't think of a non-disruptive way of doing it. For
> > instance, if this
> >
> >     [None] * N
> >
> > is given a completely new meaning, like make an empty list (of NULLs),
> > instead of making a real list of Nones, it will certainly break Python
> > code. Besides, it would probably be still faster than no pre-allocation,
> > but slower than an empty list with pre-allocation...
> >
> > Maybe
> >
> >     [NULL] * N ?
>
> What do you need it for?
>
> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.
>
>
I guess the problem is to construct new (even empty) list with
pre-allocated memory exactly for N elements.

N*[NULL] - changes semantics because there can't be list with N elements
and filled by NULL.
N*[None] - more expansive for further assignments because of Py_DECREFs.

I suppose that N*[] could do the trick. It could be optimized so that N*[]
is equal to an empty list but with preallocated memory exactly for N
elements. Could it be?

Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130307/b16893ac/attachment.html>

From szport at gmail.com  Thu Mar  7 15:39:32 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Thu, 7 Mar 2013 17:39:32 +0300
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <CAPOE21SRppQUNoVCWjm0hj+tcHaw=KH9BAd3FpgLVPsAaggqrg@mail.gmail.com>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<CAPOE21SRppQUNoVCWjm0hj+tcHaw=KH9BAd3FpgLVPsAaggqrg@mail.gmail.com>
Message-ID: <CAPOE21RF3m0irqy3SdRC+RuFz1b7rsjHjoByrHjf9vr2EoJx8Q@mail.gmail.com>

2013/3/7 Zaur Shibzukhov <szport at gmail.com>

> I guess the problem is to construct new (even empty) list with
> pre-allocated memory exactly for N elements.
>
> N*[NULL] - changes semantics because there can't be list with N elements
> and filled by NULL.
> N*[None] - more expansive for further assignments because of Py_DECREFs.
>
> I suppose that N*[] could do the trick. It could be optimized so that N*[]
> is equal to an empty list but with preallocated memory exactly for N
> elements. Could it be?
>
>
Cython optimize already PyList_Append very well. Theofore scenario when
first one create empty list with exactly prealocated memory for N elements
 and second eval elements of the list and add them using plain list.append
could optimized in Cython very well too. As result constructed list will
contain memory only for N elements. This allows don't vast memory when one
need to build many many lists with relative big size.

Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130307/186df5cf/attachment-0001.html>

From stefan_ml at behnel.de  Thu Mar  7 15:48:41 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 07 Mar 2013 15:48:41 +0100
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <CAPOE21RF3m0irqy3SdRC+RuFz1b7rsjHjoByrHjf9vr2EoJx8Q@mail.gmail.com>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<CAPOE21SRppQUNoVCWjm0hj+tcHaw=KH9BAd3FpgLVPsAaggqrg@mail.gmail.com>
	<CAPOE21RF3m0irqy3SdRC+RuFz1b7rsjHjoByrHjf9vr2EoJx8Q@mail.gmail.com>
Message-ID: <5138A8C9.3050605@behnel.de>

Zaur Shibzukhov, 07.03.2013 15:39:
> 2013/3/7 Zaur Shibzukhov
> 
>> I guess the problem is to construct new (even empty) list with
>> pre-allocated memory exactly for N elements.
>>
>> N*[NULL] - changes semantics because there can't be list with N elements
>> and filled by NULL.
>> N*[None] - more expansive for further assignments because of Py_DECREFs.
>>
>> I suppose that N*[] could do the trick.

That looks wrong to me.


>> It could be optimized so that N*[]
>> is equal to an empty list but with preallocated memory exactly for N
>> elements. Could it be?
>
> Cython optimize already PyList_Append very well. Theofore scenario when
> first one create empty list with exactly prealocated memory for N elements
>  and second eval elements of the list and add them using plain list.append
> could optimized in Cython very well too. As result constructed list will
> contain memory only for N elements. This allows don't vast memory when one
> need to build many many lists with relative big size.

I prefer not adding any new syntax as long as we are not sure we can't fix
it by making list comprehensions smarter. I tried this a while ago and have
some initial code lying around somewhere in my patch queue. Didn't have the
time to make it any usable, though, also because Cython didn't have its own
append() for list comprehensions at the time. It does now, as you noted.

Stefan


From szport at gmail.com  Thu Mar  7 18:15:45 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Thu, 7 Mar 2013 20:15:45 +0300
Subject: [Cython] Add support for list/tuple slicing
Message-ID: <CAPOE21T8_Dotxt7J2zcjaFbmn-XObXZ3ojPemGbaKb1O0J-STQ@mail.gmail.com>

Hello!

Current Cython generate for slicing of list/tuple general
PySequence_GetSlice/SetSlice call.
We could replace that to native call for Py{List|Tuple}_GetSlice and
PyList_SetSlice for lists/tuples.

Here are the changes:
https://github.com/intellimath/cython/commit/27525a5dc9f6eba31b330a6ec04e7a105191d9f5

Zaur Shibzukhov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130307/49f4ddbb/attachment.html>

From robertwb at gmail.com  Thu Mar  7 19:07:53 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 7 Mar 2013 10:07:53 -0800
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <5138A8C9.3050605@behnel.de>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<CAPOE21SRppQUNoVCWjm0hj+tcHaw=KH9BAd3FpgLVPsAaggqrg@mail.gmail.com>
	<CAPOE21RF3m0irqy3SdRC+RuFz1b7rsjHjoByrHjf9vr2EoJx8Q@mail.gmail.com>
	<5138A8C9.3050605@behnel.de>
Message-ID: <CADiQ+QAzVMN7izmtGXWn0SCw5rq-CdoBmU12r2Oc_OB8=Q7f5g@mail.gmail.com>

On Thu, Mar 7, 2013 at 6:48 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Zaur Shibzukhov, 07.03.2013 15:39:
>> 2013/3/7 Zaur Shibzukhov
>>
>>> I guess the problem is to construct new (even empty) list with
>>> pre-allocated memory exactly for N elements.
>>>
>>> N*[NULL] - changes semantics because there can't be list with N elements
>>> and filled by NULL.
>>> N*[None] - more expansive for further assignments because of Py_DECREFs.
>>>
>>> I suppose that N*[] could do the trick.
>
> That looks wrong to me.
>
>
>>> It could be optimized so that N*[]
>>> is equal to an empty list but with preallocated memory exactly for N
>>> elements. Could it be?
>>
>> Cython optimize already PyList_Append very well. Theofore scenario when
>> first one create empty list with exactly prealocated memory for N elements
>>  and second eval elements of the list and add them using plain list.append
>> could optimized in Cython very well too. As result constructed list will
>> contain memory only for N elements. This allows don't vast memory when one
>> need to build many many lists with relative big size.
>
> I prefer not adding any new syntax as long as we are not sure we can't fix
> it by making list comprehensions smarter. I tried this a while ago and have
> some initial code lying around somewhere in my patch queue. Didn't have the
> time to make it any usable, though, also because Cython didn't have its own
> append() for list comprehensions at the time. It does now, as you noted.

There are several cases where we can get the size of the result list
upfront, we can certainly do better here now. I'm also -1 to adding
special syntax for populating a list with NULL values, if you really
want to do this (and I doubt it really matters in most cases) calling
PyList_New is the "syntax" to use.

- Robert

From yury at shurup.com  Thu Mar  7 19:26:26 2013
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 07 Mar 2013 19:26:26 +0100
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <51387843.602@behnel.de>
References: <1362654970.2849.9.camel@newpride>  <51387843.602@behnel.de>
Message-ID: <1362680786.2664.12.camel@newpride>

Hi Stefan,

On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote:

> Note that Python has an algorithm for shrinking a list on appending, so
> this might not be sufficient for your use case. What do you need it for?

W00t! I didn't know about that.

I'm wrapping a C++ code that should transmit large lists of objects to
Python, while these objects are stored into something vector-like, which
shouldn't get exposed directly. In the past, they did something like

    obj = PyList_New(a.size());
    for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...)

I figured I can translate it into a while loop like

    obj = []
    while (it != a.end()):
        obj.append(it)
        inc(it)

but then I'm not using the information about the size of a that I
already have, and for huge lists this tends to be quite slow.

I think this must be quite a common use case for bindings...

> And why won't [None]*N help you out? It should be pretty cheap.

It probably will, at least a bit. It just stroke me that if I'm going to
do something along the lines of

    idx = 0
    obj = [None] * a.size()
    while (it != a.end()):
        obj[idx] = it
        idx += 1
        inc(it)

I could also squeeze the last bits of performance by avoiding the
creation of Nones and subsequently populating the list with them.

If you say I have to use PyList_New directly, oh well... It's just that
now since I'm rewriting the bindings in Cython anyways, I'm also trying
to avoid using Python C API directly as much as possible.

> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.

I guess not.

-- 
Sincerely yours,
Yury V. Zaytsev


From robertwb at gmail.com  Thu Mar  7 19:44:48 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 7 Mar 2013 10:44:48 -0800
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <1362680786.2664.12.camel@newpride>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<1362680786.2664.12.camel@newpride>
Message-ID: <CADiQ+QCuRpaubXbF=pMsRzDWgfcNa_FtL1QBhUhUOB_Noi-GGA@mail.gmail.com>

On Thu, Mar 7, 2013 at 10:26 AM, Yury V. Zaytsev <yury at shurup.com> wrote:
> Hi Stefan,
>
> On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote:
>
>> Note that Python has an algorithm for shrinking a list on appending, so
>> this might not be sufficient for your use case. What do you need it for?
>
> W00t! I didn't know about that.
>
> I'm wrapping a C++ code that should transmit large lists of objects to
> Python, while these objects are stored into something vector-like, which
> shouldn't get exposed directly. In the past, they did something like
>
>     obj = PyList_New(a.size());
>     for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...)
>
> I figured I can translate it into a while loop like
>
>     obj = []
>     while (it != a.end()):
>         obj.append(it)
>         inc(it)
>
> but then I'm not using the information about the size of a that I
> already have, and for huge lists this tends to be quite slow.
>
> I think this must be quite a common use case for bindings...
>
>> And why won't [None]*N help you out? It should be pretty cheap.
>
> It probably will, at least a bit. It just stroke me that if I'm going to
> do something along the lines of
>
>     idx = 0
>     obj = [None] * a.size()
>     while (it != a.end()):
>         obj[idx] = it
>         idx += 1
>         inc(it)
>
> I could also squeeze the last bits of performance by avoiding the
> creation of Nones and subsequently populating the list with them.
>
> If you say I have to use PyList_New directly, oh well... It's just that
> now since I'm rewriting the bindings in Cython anyways, I'm also trying
> to avoid using Python C API directly as much as possible.

I would time the two approaches to see if it really matters.

>> Won't list comprehensions work for you? They could potentially be adapted
>> to presize the list.
>
> I guess not.

    [o for o in a]

is nice and clean. If a has a size method (a common stl pattern), we
could optimistically call that to do the pre-allocation.

I don't know exactly what your usecase is, but you might consider
simply exposing a list-like wrapper supporting __getitem__ and
iteration, rather than eagerly converting the entire thing to a list.

- Robert

From greg.ewing at canterbury.ac.nz  Fri Mar  8 00:19:53 2013
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 08 Mar 2013 12:19:53 +1300
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <op.wtkrp3vgyabthb@juga>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<op.wtkrp3vgyabthb@juga>
Message-ID: <51392099.7040205@canterbury.ac.nz>

Nikita Nemkin wrote:
> Sorry, accidental early send. Previous mail continued...
> 
> [None] * N makes an extra pass over the list to assign None to each item
> (and also incref None n times).

Maybe this could be optimised by adding N to the reference
count instead of incrementing it N times?

-- 
Greg

From robertwb at gmail.com  Fri Mar  8 00:41:54 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 7 Mar 2013 15:41:54 -0800
Subject: [Cython] Cython syntax to pre-allocate lists for performance
In-Reply-To: <51392099.7040205@canterbury.ac.nz>
References: <1362654970.2849.9.camel@newpride> <51387843.602@behnel.de>
	<op.wtkrp3vgyabthb@juga> <51392099.7040205@canterbury.ac.nz>
Message-ID: <CADiQ+QA1F6sVAuHyQKfRFuO9GMEDCod3bYmzjGa_oz=Pt4qMvw@mail.gmail.com>

On Thu, Mar 7, 2013 at 3:19 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Nikita Nemkin wrote:
>>
>> Sorry, accidental early send. Previous mail continued...
>>
>> [None] * N makes an extra pass over the list to assign None to each item
>> (and also incref None n times).
>
>
> Maybe this could be optimised by adding N to the reference
> count instead of incrementing it N times?

I'd be surprised if the C compiler doesn't.

http://hg.python.org/cpython/file/1d4849f9e37d/Objects/listobject.c#l515

From szport at gmail.com  Fri Mar  8 08:49:47 2013
From: szport at gmail.com (Zaur Shibzukhov)
Date: Fri, 8 Mar 2013 10:49:47 +0300
Subject: [Cython] Add support for list/tuple slicing
In-Reply-To: <CAPOE21T8_Dotxt7J2zcjaFbmn-XObXZ3ojPemGbaKb1O0J-STQ@mail.gmail.com>
References: <CAPOE21T8_Dotxt7J2zcjaFbmn-XObXZ3ojPemGbaKb1O0J-STQ@mail.gmail.com>
Message-ID: <CAPOE21RF_EF05-FMHwm6T7zTYRa9gK0brk3dV8YyXTeBmEwZ2Q@mail.gmail.com>

2013/3/7 Zaur Shibzukhov <szport at gmail.com>:
> Current Cython generate for slicing of list/tuple general
> PySequence_GetSlice/SetSlice call.
> We could replace that to native call for Py{List|Tuple}_GetSlice and
> PyList_SetSlice for lists/tuples.

There is updated change that use utility code
__Pyx_Py{List|Tuple}_GetSlice because Py{List|Tuple}_GetSlice dosn't
support negative indices. That job do (in CPython) {list|tuple}slice
function from type object's slot ({list|tuple}_subscript), but it
handle both indices and slice objects which add overhead. That's the
reason why PySequence_GetSlice is slower: it create slice object and
falls to {list|tuple}_subscript. Therefore I added utility code.

Here is utility code:

/////////////// PyList_GetSlice.proto ///////////////

static PyObject* __Pyx_PyList_GetSlice(
                PyObject* lst, Py_ssize_t start, Py_ssize_t stop);

/////////////// PyList_GetSlice ///////////////

PyObject* __Pyx_PyList_GetSlice(
                PyObject* lst, Py_ssize_t start, Py_ssize_t stop) {
    Py_ssize_t i, length;
    PyListObject* np;
    PyObject **src, **dest;
    PyObject *v;

    length = PyList_GET_SIZE(lst);

    if (start < 0) {
        start += length;
        if (start < 0)
            start = 0;
    }

    if (stop < 0)
        stop += length;
    else if (stop > length)
        stop = length;

    length = stop - start;
    if (length <= 0)
        return PyList_New(0);

    np = (PyListObject*) PyList_New(length);
    if (np == NULL)
        return NULL;

    src = ((PyListObject*)lst)->ob_item + start;
    dest = np->ob_item;
    for (i = 0; i < length; i++) {
        v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    return (PyObject*)np;
}

/////////////// PyTuple_GetSlice.proto ///////////////

static PyObject* __Pyx_PyTuple_GetSlice(
                PyObject* ob, Py_ssize_t start, Py_ssize_t stop);

/////////////// PyTuple_GetSlice ///////////////

PyObject* __Pyx_PyTuple_GetSlice(
                PyObject* ob, Py_ssize_t start, Py_ssize_t stop) {
    Py_ssize_t i, length;
    PyTupleObject* np;
    PyObject **src, **dest;
    PyObject *v;

    length = PyTuple_GET_SIZE(ob);

    if (start < 0) {
        start += length;
        if (start < 0)
            start = 0;
    }

    if (stop < 0)
        stop += length;
    else if (stop > length)
        stop = length;

    length = stop - start;
    if (length <= 0)
        return PyList_New(0);

    np = (PyTupleObject *) PyTuple_New(length);
    if (np == NULL)
        return NULL;

    src = ((PyTupleObject*)ob)->ob_item + start;
    dest = np->ob_item;
    for (i = 0; i < length; i++) {
        v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    return (PyObject*)np;
}

Here is testing code:

list_slice.pyx
-----------------

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
    inline object __Pyx_PyList_GetSlice(object ob, int start, int stop)
    inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)


cdef list lst = list(range(10))
cdef list lst2 = list(range(7))

def get_slice1(list lst):
    cdef int i
    cdef list res = []

    for i in range(200000):
        res.append(PySequence_GetSlice(lst, 2, 8))

    return res

def get_slice2(list lst):
    cdef int i
    cdef list res = []

    for i in range(200000):
        res.append(__Pyx_PyList_GetSlice(lst, 2, 8))

    return res

def test_get_slice1():
    get_slice1(lst)

def test_get_slice2():
    get_slice2(lst)

tuple_slicing.pyx
-----------------------

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
    inline object __Pyx_PyList_GetSlice(object lst, int start, int stop)
    inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)

cdef tuple lst = tuple(range(10))

def get_slice1(tuple lst):
    cdef int i
    cdef list res = []

    for i in range(200000):
        res.append(PySequence_GetSlice(lst, 2, 8))

    return res

def get_slice2(tuple lst):
    cdef int i
    cdef list res = []

    for i in range(200000):
        res.append(__Pyx_PyTuple_GetSlice(lst, 2, 8))

    return res


def test_get_slice1():
    get_slice1(lst)

def test_get_slice2():
    get_slice2(lst)

Here are timings:

for list

(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.2 10.3 10.4 10.1 10.2
100 loops, best of 5: 101 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.3 10.3 10.2 10.3 10.2
100 loops, best of 5: 102 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice2" "test_get_slice2()"
raw times: 8.16 8.19 8.17 8.2 8.16
100 loops, best of 5: 81.6 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice2" "test_get_slice2()"
raw times: 8.1 8.05 8.03 8.06 8.07
100 loops, best of 5: 80.3 msec per loop

for tuple

(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.tuple_slice import test_get_slice1" "test_get_slice1()"
raw times: 7.2 7.16 7.16 7.18 7.17
100 loops, best of 5: 71.6 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.tuple_slice import test_get_slice1" "test_get_slice1()"
raw times: 7.22 7.22 7.19 7.18 7.18
100 loops, best of 5: 71.8 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.tuple_slice import test_get_slice2" "test_get_slice2()"
raw times: 9.23 5.2 4.95 4.96 4.98
100 loops, best of 5: 49.5 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.tuple_slice import test_get_slice2" "test_get_slice2()"
raw times: 4.92 4.93 4.9 4.94 4.92
100 loops, best of 5: 49 msec per loop

This change dosn't contain list slice assignments because previous
testing and timings showed that this need more analysis.

Maybe I'l make pull request with this change + tests?

Zaur Shibzukhov

From ben.strulo at bt.com  Fri Mar  8 10:25:10 2013
From: ben.strulo at bt.com (ben.strulo at bt.com)
Date: Fri, 8 Mar 2013 09:25:10 +0000
Subject: [Cython] Probably Memory Leak
Message-ID: <ED34290FF9C6E046807AE2A733D4CF9C09D4F44D9E@EMV68-UKRD.domain1.systemhost.net>

Hi there,

I think I may have found a memory leak in cpython.array.  Or I may have screwed up:

I have a test.pyx containing:

#----------------------------

from cpython.array cimport array,clone

cdef class Test(object):
    cdef int[:] myarr

    def __init__(self):
        cdef array templatei = array("i")
        self.myarr = clone(templatei,10000,True)

#----------------------------

Then a test harness which is just:

#----------------------------

import test

i = 0
while True:
    print i
    i += 1
    s = test.Test()

#----------------------------

And this fills memory until I get a MemoryError exception.  I'm using a fresh copy of Cython from Git (unless I messed that up :)) on Windows, compiling with MSVC 9.  Not sure what diagnostics might help but it's a pretty simple test case.

I haven't found a bug in the Cython source but this doesn't seem right.

Hope this is of interest

Ben Strulo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130308/a3c6b468/attachment.html>

From stefan_ml at behnel.de  Fri Mar 15 11:59:14 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 15 Mar 2013 11:59:14 +0100
Subject: [Cython] python-dev discussion on better CPython core-level
	parallelism
Message-ID: <5142FF02.7030001@behnel.de>

http://thread.gmane.org/gmane.comp.python.devel/137858/focus=137858

From stefan_ml at behnel.de  Fri Mar 15 20:52:22 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 15 Mar 2013 20:52:22 +0100
Subject: [Cython] [cython] BUG: Avoid exporting symbols in MemoryView
 utility code. (#197)
In-Reply-To: <cython/cython/pull/197@github.com>
References: <cython/cython/pull/197@github.com>
Message-ID: <51437BF6.9030106@behnel.de>

Hi,

this change revealed that the generated utility code functions are written
into the C code a bit too unconditionally. The numpy_memoryview test shows
lots of C compiler warnings about unused dtype conversion functions now:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/1165/warnings15Result/package.45/file.1478077899/

Stefan


From johntyree at gmail.com  Sat Mar 16 18:16:18 2013
From: johntyree at gmail.com (John Tyree)
Date: Sat, 16 Mar 2013 18:16:18 +0100
Subject: [Cython] Template functions
In-Reply-To: <mailman.43.1363345203.3187.cython-devel@python.org>
References: <mailman.43.1363345203.3187.cython-devel@python.org>
Message-ID: <20130316171618.GA15429@gmail.com>

There is currently a void in Cython's C++ support with respect to function (not
class) templates. It would be great to have such a thing, dangerous or not, so
I'm proposing something to get things rolling.

Given that function templates are 100% transparent to the caller, it seems that
the only barrier is Cython's type system. Even in the easiest case, where the
function returns a known primitive type for all input, we still can't use it.
    
    template<typename T>
    std::string to_string(T a)

    -------

    from libcpp.string import string as cpp_string

    cdef extern from "foo.h" namespace "std":

        cpp_string to_string(??? a, ??? b)
        

We can used fused types if we know that the function is restricted to numeric
types, for example, but in general this is not the case. The only workaround I
currently have is to declare the function N times for N types. This isn't
disastrous, but prevents sharing of code.

As an alternative, what about a dynamic ANY type that uses the fused type
machinery, but always succeeds when specializing? Or perhaps it just shouldn't
be type checked at all? There is always a backend that will generate the type
error and this possibly gives us macro "functions" for free in C.


    cdef extern from "foo.h" namespace "std":

        cpp_string to_string(cython.any_t a, cython.any_t b)


    Pros:
        Huge number of functions become accessible from Cython
        User explicitly states when a type should be unchecked
        Allows mixtures of typed and untyped parameters in a single call

    Cons:
        Makes determining return types hard in some cases.
        Error messages might be difficult to interpret
        ?????
        I'm-sure-this-list-should-be-longer


I'll admit I haven't dug very deep as far as the implications of such a thing.
Is it a reasonable idea? What are the major issues with such an approach?

-John

From robertwb at gmail.com  Sat Mar 16 18:22:08 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Sat, 16 Mar 2013 10:22:08 -0700
Subject: [Cython] Template functions
In-Reply-To: <20130316171618.GA15429@gmail.com>
References: <mailman.43.1363345203.3187.cython-devel@python.org>
	<20130316171618.GA15429@gmail.com>
Message-ID: <CADiQ+QBsaRcOxCLDjfuHHwhtPHbZZ-E9rB7f4+QaE9cAr24JOg@mail.gmail.com>

On Sat, Mar 16, 2013 at 10:16 AM, John Tyree <johntyree at gmail.com> wrote:
> There is currently a void in Cython's C++ support with respect to function (not
> class) templates. It would be great to have such a thing, dangerous or not, so
> I'm proposing something to get things rolling.
>
> Given that function templates are 100% transparent to the caller, it seems that
> the only barrier is Cython's type system. Even in the easiest case, where the
> function returns a known primitive type for all input, we still can't use it.
>
>     template<typename T>
>     std::string to_string(T a)
>
>     -------
>
>     from libcpp.string import string as cpp_string
>
>     cdef extern from "foo.h" namespace "std":
>
>         cpp_string to_string(??? a, ??? b)
>
>
> We can used fused types if we know that the function is restricted to numeric
> types, for example, but in general this is not the case. The only workaround I
> currently have is to declare the function N times for N types. This isn't
> disastrous, but prevents sharing of code.
>
> As an alternative, what about a dynamic ANY type that uses the fused type
> machinery, but always succeeds when specializing? Or perhaps it just shouldn't
> be type checked at all? There is always a backend that will generate the type
> error and this possibly gives us macro "functions" for free in C.
>
>
>     cdef extern from "foo.h" namespace "std":
>
>         cpp_string to_string(cython.any_t a, cython.any_t b)
>
>
>     Pros:
>         Huge number of functions become accessible from Cython
>         User explicitly states when a type should be unchecked
>         Allows mixtures of typed and untyped parameters in a single call
>
>     Cons:
>         Makes determining return types hard in some cases.
>         Error messages might be difficult to interpret
>         ?????
>         I'm-sure-this-list-should-be-longer
>
>
> I'll admit I haven't dug very deep as far as the implications of such a thing.
> Is it a reasonable idea? What are the major issues with such an approach?

I was thinking of something along the lines of

cdef extern from ...:
    cpp_string to_string[T](T value)
    T my_func[T, S](T a, S b)
    ...

It's more a question of how to implement it.

- Robert

From nikita at nemkin.ru  Sat Mar 16 21:39:33 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sun, 17 Mar 2013 02:39:33 +0600
Subject: [Cython] Minor bug: emitted junk line prevents compilation
Message-ID: <op.wt13n7ijyabthb@juga>

Hi,

I believe I have found a bit of broken/junk code.
This line produces an unpaired and unnecessary #if directive:
https://github.com/cython/cython/blob/master/Cython/Compiler/ModuleNode.py#L2423
The fix is to simply remove it.

In case you are interested in how to hit this line, declare in some .pxd:

     cdef extern from "Python.h":
         ctypedef class __builtin__.BaseException [object  
PyBaseExceptionObject]:
             pass

and cimport it in another .pyx.

Best regards,
Nikita Nemkin

From stefan_ml at behnel.de  Sat Mar 16 21:59:12 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 16 Mar 2013 21:59:12 +0100
Subject: [Cython] Minor bug: emitted junk line prevents compilation
In-Reply-To: <op.wt13n7ijyabthb@juga>
References: <op.wt13n7ijyabthb@juga>
Message-ID: <5144DD20.3050804@behnel.de>

Nikita Nemkin, 16.03.2013 21:39:
> I believe I have found a bit of broken/junk code.
> This line produces an unpaired and unnecessary #if directive:
> https://github.com/cython/cython/blob/master/Cython/Compiler/ModuleNode.py#L2423
> 
> The fix is to simply remove it.

Yes, it's correctly used further down in the code. Thanks!


> In case you are interested in how to hit this line, declare in some .pxd:
> 
>     cdef extern from "Python.h":
>         ctypedef class __builtin__.BaseException [object
> PyBaseExceptionObject]:
>             pass

Why would you need to do that in your code?


> and cimport it in another .pyx.

It's sad that the cross-module importing and C-API code is so badly tested.
Any help to improve this situation will be very warmly appreciated.

Stefan


From nikita at nemkin.ru  Sat Mar 16 22:30:58 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Sun, 17 Mar 2013 03:30:58 +0600
Subject: [Cython] Minor bug: emitted junk line prevents compilation
In-Reply-To: <5144DD20.3050804@behnel.de>
References: <op.wt13n7ijyabthb@juga> <5144DD20.3050804@behnel.de>
Message-ID: <op.wt151whvyabthb@juga>

On Sun, 17 Mar 2013 02:59:12 +0600, Stefan Behnel <stefan_ml at behnel.de>  
wrote:

>> In case you are interested in how to hit this line, declare in some  
>> .pxd:
>>
>>     cdef extern from "Python.h":
>>         ctypedef class __builtin__.BaseException [object  
>> PyBaseExceptionObject]:
>>             pass
>
> Why would you need to do that in your code?

It makes Cython treat BaseException as an extension type
(Exception is declared similarly) and allows for things like:

* Using "Exception" as a type for parameters, attributes, casts.
   All this with Cython-generated and optimized typeckecking.
* Creating Exception subclasses as cdef classes.

It's a hack, but a very useful one.

Best regards,
Nikita Nemkin

From pav at iki.fi  Sun Mar 17 17:15:51 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 17 Mar 2013 18:15:51 +0200
Subject: [Cython] Refcount error with fused types in classes
Message-ID: <ki4q7k$j5c$1@ger.gmane.org>

Hi,

Here's a snippet demonstrating a refcount error with fused types inside
classes:

---------8<---------
cimport cython

ctypedef fused some_t:
    int
    double

class Foo(object):
    def bar(self, some_t x):
        pass

cdef extern from "Python.h":
    int Py_REFCNT(object)

def main():
    x = Foo()
    print "before:", Py_REFCNT(x)
    x.bar(1.0) # spuriously increments refcount of `x`
    print "after: ", Py_REFCNT(x)
---------8<---------


-- 
Pauli Virtanen


From markflorisson88 at gmail.com  Sun Mar 17 17:51:37 2013
From: markflorisson88 at gmail.com (mark florisson)
Date: Sun, 17 Mar 2013 16:51:37 +0000
Subject: [Cython] Refcount error with fused types in classes
In-Reply-To: <ki4q7k$j5c$1@ger.gmane.org>
References: <ki4q7k$j5c$1@ger.gmane.org>
Message-ID: <CANg26EUNfwNuvVqkuow7w9tt0235rhRQ4BqthfZ7QiZ=p+R_8Q@mail.gmail.com>

On 17 March 2013 16:15, Pauli Virtanen <pav at iki.fi> wrote:
> Hi,
>
> Here's a snippet demonstrating a refcount error with fused types inside
> classes:
>
> ---------8<---------
> cimport cython
>
> ctypedef fused some_t:
>     int
>     double
>
> class Foo(object):
>     def bar(self, some_t x):
>         pass
>
> cdef extern from "Python.h":
>     int Py_REFCNT(object)
>
> def main():
>     x = Foo()
>     print "before:", Py_REFCNT(x)
>     x.bar(1.0) # spuriously increments refcount of `x`
>     print "after: ", Py_REFCNT(x)
> ---------8<---------
>
>
> --
> Pauli Virtanen
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

Thanks, I pushed a fix here: https://github.com/markflorisson88/cython
(fd4853d202b13a92).

From pav at iki.fi  Sun Mar 17 18:18:07 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 17 Mar 2013 19:18:07 +0200
Subject: [Cython] Refcount error with fused types in classes
In-Reply-To: <CANg26EUNfwNuvVqkuow7w9tt0235rhRQ4BqthfZ7QiZ=p+R_8Q@mail.gmail.com>
References: <ki4q7k$j5c$1@ger.gmane.org>
	<CANg26EUNfwNuvVqkuow7w9tt0235rhRQ4BqthfZ7QiZ=p+R_8Q@mail.gmail.com>
Message-ID: <ki4tsc$jtn$1@ger.gmane.org>

Hi,

17.03.2013 18:51, mark florisson kirjoitti:
[clip]
> Thanks, I pushed a fix here: https://github.com/markflorisson88/cython
> (fd4853d202b13a92).

Thanks. You beat me to this, I just arrived at the same fix :)

Cheers,
Pauli


From pav at iki.fi  Sun Mar 17 18:32:20 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 17 Mar 2013 19:32:20 +0200
Subject: [Cython] Refcount error with fused types in classes
In-Reply-To: <ki4tsc$jtn$1@ger.gmane.org>
References: <ki4q7k$j5c$1@ger.gmane.org>
	<CANg26EUNfwNuvVqkuow7w9tt0235rhRQ4BqthfZ7QiZ=p+R_8Q@mail.gmail.com>
	<ki4tsc$jtn$1@ger.gmane.org>
Message-ID: <ki4un1$qo7$1@ger.gmane.org>

17.03.2013 19:18, Pauli Virtanen kirjoitti:
> Hi,
> 
> 17.03.2013 18:51, mark florisson kirjoitti:
> [clip]
>> Thanks, I pushed a fix here: https://github.com/markflorisson88/cython
>> (fd4853d202b13a92).
> 
> Thanks. You beat me to this, I just arrived at the same fix :)

Note that the Py_XDECREF(self->__signatures__) needs to be removed from
_dealloc, though.

-- 
Pauli Virtanen


From johntyree at gmail.com  Sun Mar 17 20:15:17 2013
From: johntyree at gmail.com (John Tyree)
Date: Sun, 17 Mar 2013 20:15:17 +0100
Subject: [Cython] Template functions
In-Reply-To: <mailman.29.1363518003.28422.cython-devel@python.org>
References: <mailman.29.1363518003.28422.cython-devel@python.org>
Message-ID: <20130317191517.GA6530@gmail.com>

> 
> I was thinking of something along the lines of
> 
> cdef extern from ...:
>     cpp_string to_string[T](T value)
>     T my_func[T, S](T a, S b)
>     ...
> 
> It's more a question of how to implement it.
> 
> - Robert

Well this closely matches the syntax used for classes and won't require any type
inference, since the user supplies the type at the call site (am I reading that
correctly?) so I'm not sure what about it will be particularly challenging.

If it's done this way the compiler could generate prototypes as necessary in a
preprocessing step, without inferring anything about the types until later when
overloading is resolved. That feels kind of hacky to me, but I've never written
a compiler with the size and scope of Cython, maybe it's not too bad.  This is
essentially what the user has to do already, and it "works".

The biggest complaint I have about this method is that without inference it
looks like it could lead to a *lot* of extra writing out of types. I'm dreading
the thought of writing out nested template types when calling factory functions
like those in the thrust library, which was what motivated this in the first
place.

-John

From nikita at nemkin.ru  Mon Mar 18 14:17:13 2013
From: nikita at nemkin.ru (Nikita Nemkin)
Date: Mon, 18 Mar 2013 19:17:13 +0600
Subject: [Cython] Minor bun in compile time constant handling
Message-ID: <op.wt48izkjyabthb@juga>

Hi,

Here:
https://github.com/cython/cython/blob/master/Cython/Compiler/Parsing.py#L708-L711
compile-time unicode and bytes values should be wrapped
with EncodedString and BytesLiteral respectively:

     elif isinstance(value, _unicode):
         return ExprNodes.UnicodeNode(pos, value=EncodedString(value))
     elif isinstance(value, _bytes):
         return ExprNodes.BytesNode(pos, value=BytesLiteral(value))

Otherwise attempts to use compile-time strings in Python context result
in errors like "AttributeError: 'unicode' object has no attribute  
'is_unicode'".

Best regards,
Nikita Nemkin

From robertwb at gmail.com  Mon Mar 18 17:43:14 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 18 Mar 2013 09:43:14 -0700
Subject: [Cython] Template functions
In-Reply-To: <20130317191517.GA6530@gmail.com>
References: <mailman.29.1363518003.28422.cython-devel@python.org>
	<20130317191517.GA6530@gmail.com>
Message-ID: <CADiQ+QDKSYGQRQDf7UNjhzUY2Vi-4dmEhWLWJhDZ4uat-mEkgw@mail.gmail.com>

On Sun, Mar 17, 2013 at 12:15 PM, John Tyree <johntyree at gmail.com> wrote:
>>
>> I was thinking of something along the lines of
>>
>> cdef extern from ...:
>>     cpp_string to_string[T](T value)
>>     T my_func[T, S](T a, S b)
>>     ...
>>
>> It's more a question of how to implement it.
>>
>> - Robert
>
> Well this closely matches the syntax used for classes and won't require any type
> inference, since the user supplies the type at the call site (am I reading that
> correctly?) so I'm not sure what about it will be particularly challenging.
>
> If it's done this way the compiler could generate prototypes as necessary in a
> preprocessing step, without inferring anything about the types until later when
> overloading is resolved. That feels kind of hacky to me, but I've never written
> a compiler with the size and scope of Cython, maybe it's not too bad.  This is
> essentially what the user has to do already, and it "works".
>
> The biggest complaint I have about this method is that without inference it
> looks like it could lead to a *lot* of extra writing out of types. I'm dreading
> the thought of writing out nested template types when calling factory functions
> like those in the thrust library, which was what motivated this in the first
> place.

I think we need something to constrain the argument types (e.g. as
they relate to each other), as well as provide a return type. The
"any" type seems to lead way to easily to incorrect code, as well as
surprises, e.g. is the "object" type accepted? (FWIW, I was thinking
of allowing inference, that'll actually be pretty easy once the rest
is in place.)

- Robert

From stefan_ml at behnel.de  Wed Mar 20 07:47:11 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 20 Mar 2013 07:47:11 +0100
Subject: [Cython] Jenkins down
Message-ID: <51495B6F.5080607@behnel.de>

Hi,

just to let you know that sage.math (where our Jenkins instance runs) has
been down for a couple of days already and it's currently unclear when it
will be back. Most likely during the next days, though. Note that this
means that we have lost all work spaces and git caches, so the first builds
will take somewhat longer than normal once we get it restarted (assuming
that everything comes up nicely in the first place...).

Stefan

From volker.mische at gmail.com  Fri Mar 22 14:47:52 2013
From: volker.mische at gmail.com (Volker Mische)
Date: Fri, 22 Mar 2013 14:47:52 +0100
Subject: [Cython] Constant pointers not working
Message-ID: <514C6108.5040709@gmail.com>

Hi all,

I was excited to see that 'const' is finally supported, but constant
pointers are not. Here's an example with the corresponding error:

Error compiling Cython file:
------------------------------------------------------------
...
cdef extern int foo(const int *const bar)
                                    ^
------------------------------------------------------------

const.pxd:1:37: Expected ')', found 'bar'

Cheers,
  Volker

From matej at laitl.cz  Sat Mar 23 23:57:51 2013
From: matej at laitl.cz (=?utf-8?B?TWF0xJtq?= Laitl)
Date: Sat, 23 Mar 2013 23:57:51 +0100
Subject: [Cython] view[0].methodcall() produces invalid C when view is
	memoryview of extension type
Message-ID: <1401112.QzbjNdWC9X@edgy>

Hi,
following test code produces C code that fails to compile:
> cdef class ExtensionType(object):
>     cdef public int dummy
>     
>     def __init__(self, n):
>         self.dummy = n
>     
>     cdef cfoo(self):
>         print self.dummy
> 
> items = [ExtensionType(1), ExtensionType(2)]
> cdef ExtensionType[:] view = np.array(items, dtype=ExtensionType)
> view[0].cfoo()

with gcc error and relevant C file lines:
extension_type_memoryview.c:2604:94: error: ?PyObject? has no member named 
?__pyx_vtab?

2570:  PyObject *__pyx_t_1 = NULL;
(...)
2601:  __pyx_t_1 = (PyObject *) *((struct 
    :__pyx_obj_25extension_type_memoryview_ExtensionType * *) ( /* dim=0 */
    : (__pyx_v_25extension_type_memoryview_view.data + __pyx_t_2 *
    : __pyx_v_25extension_type_memoryview_view.strides[0]) ));
2602:  __Pyx_INCREF((PyObject*)__pyx_t_1);
2603:  /* __pyx_t_4 allocated */
2604:  __pyx_t_4 = ((struct
    : __pyx_vtabstruct_25extension_type_memoryview_ExtensionType *)__pyx_t_1
    :->__pyx_vtab)->cfoo(__pyx_t_1); if (unlikely(!__pyx_t_4)) {__pyx_filename
    : = __pyx_f[0]; __pyx_lineno = 69;

It seems that generic PyObject* temporary for __pyx_t_1 is used here while 
typed ExtensionType* temporary should have been used instead (as suggested by 
excess casting on line 2601).

I have the above test-case (and a bit more) prepared as a patch that I'll 
pull-request one this seemingly trivial bug is fixed in git.

Regards,
		Mat?j

From stefan_ml at behnel.de  Sun Mar 24 17:29:00 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 24 Mar 2013 17:29:00 +0100
Subject: [Cython] Jenkins down
In-Reply-To: <51495B6F.5080607@behnel.de>
References: <51495B6F.5080607@behnel.de>
Message-ID: <514F29CC.8030005@behnel.de>

Stefan Behnel, 20.03.2013 07:47:
> just to let you know that sage.math (where our Jenkins instance runs) has
> been down for a couple of days already and it's currently unclear when it
> will be back. Most likely during the next days, though. Note that this
> means that we have lost all work spaces and git caches, so the first builds
> will take somewhat longer than normal once we get it restarted (assuming
> that everything comes up nicely in the first place...).

I started Jenkins on the boxen server (which the sage.math DNS entry
currently forwards to), running on a local disk AFAICT. It looks ok so far,
although I had to disable the 32bit tests due to missing Ubuntu packages.
It will also be generally slower than on sage.math, because we no longer
have a ramdisk to put the workspaces into. But I think it's most important
to have it back alive at all.

Stefan


From stefan_ml at behnel.de  Sun Mar 24 18:02:33 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 24 Mar 2013 18:02:33 +0100
Subject: [Cython] timeframe for 0.19?
Message-ID: <514F31A9.4050103@behnel.de>

Hi,

the current master has collected quite a number of improvements and I think
we should try to get them out of the door. Any objections to starting with
the preparations in the first week of April?

Stefan

From wstein at gmail.com  Sun Mar 24 19:21:23 2013
From: wstein at gmail.com (William Stein)
Date: Sun, 24 Mar 2013 11:21:23 -0700
Subject: [Cython] Jenkins down
In-Reply-To: <514F29CC.8030005@behnel.de>
References: <51495B6F.5080607@behnel.de> <514F29CC.8030005@behnel.de>
Message-ID: <CACLE5GBU96wwVaE4=axhWyc4jmf8GFh9X5vcV59H3U__y0MD7g@mail.gmail.com>

On Sun, Mar 24, 2013 at 9:29 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Stefan Behnel, 20.03.2013 07:47:
>> just to let you know that sage.math (where our Jenkins instance runs) has
>> been down for a couple of days already and it's currently unclear when it
>> will be back. Most likely during the next days, though. Note that this
>> means that we have lost all work spaces and git caches, so the first builds
>> will take somewhat longer than normal once we get it restarted (assuming
>> that everything comes up nicely in the first place...).
>
> I started Jenkins on the boxen server (which the sage.math DNS entry
> currently forwards to), running on a local disk AFAICT. It looks ok so far,
> although I had to disable the 32bit tests due to missing Ubuntu packages.

Which packages?  I can install them easily.

> It will also be generally slower than on sage.math, because we no longer
> have a ramdisk to put the workspaces into. But I think it's most important
> to have it back alive at all.

>
> Stefan
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel


-- 
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

From stefan_ml at behnel.de  Sun Mar 24 20:14:32 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 24 Mar 2013 20:14:32 +0100
Subject: [Cython] Jenkins down
In-Reply-To: <CACLE5GBU96wwVaE4=axhWyc4jmf8GFh9X5vcV59H3U__y0MD7g@mail.gmail.com>
References: <51495B6F.5080607@behnel.de> <514F29CC.8030005@behnel.de>
	<CACLE5GBU96wwVaE4=axhWyc4jmf8GFh9X5vcV59H3U__y0MD7g@mail.gmail.com>
Message-ID: <514F5098.8090100@behnel.de>

William Stein, 24.03.2013 19:21:
> On Sun, Mar 24, 2013 at 9:29 AM, Stefan Behnel wrote:
>> I started Jenkins on the boxen server (which the sage.math DNS entry
>> currently forwards to), running on a local disk AFAICT. It looks ok so far,
>> although I had to disable the 32bit tests due to missing Ubuntu packages.
> 
> Which packages?  I can install them easily.

Thanks - I would have asked if I knew which ones. I couldn't look into this
yet, just noticed that the 32bit builds didn't work "for some reason".
Basically, sage.math had a (mostly) working 32bit gcc build environment,
but I'll have to see what that included.

Stefan


From robertwb at gmail.com  Mon Mar 25 18:43:23 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 25 Mar 2013 10:43:23 -0700
Subject: [Cython] timeframe for 0.19?
In-Reply-To: <514F31A9.4050103@behnel.de>
References: <514F31A9.4050103@behnel.de>
Message-ID: <CADiQ+QDWYjLMZ8OY1Hn0orOkBce=vfJ+QgM+ZBdeH-szGMkZNg@mail.gmail.com>

Sound good to me.

On Sun, Mar 24, 2013 at 10:02 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Hi,
>
> the current master has collected quite a number of improvements and I think
> we should try to get them out of the door. Any objections to starting with
> the preparations in the first week of April?
>
> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

From Martin.Fiers at intec.ugent.be  Tue Mar 26 10:52:02 2013
From: Martin.Fiers at intec.ugent.be (Martin Fiers)
Date: Tue, 26 Mar 2013 10:52:02 +0100
Subject: [Cython] Bug: Returning real value crashes the code,
	complex value does not
Message-ID: <51516FC2.1090802@intec.ugent.be>

Dear Cython developers,

I stumbled upon a strange error when using Cython. I made a minimal 
working example, see attachment for the two necessary files. (btw I 
didn't find the e-mail address of Robert Bradshaw so I could not request 
him for an account on the issue tracker. Is it possible to put the bug 
on there?)

To reproduce the bug:
1) Reboot to Windows :) (the bug only appears on Windows)
2) Run compile_bug.py to generate the Cython extension
3) Try to run the my_func_exposed function:

python
 >>> import complex_double
(does not crash)
 >>> complex_double.my_func_exposed(1,1j)
(crashes)
 >>> complex_double.my_func_exposed(1,1)

If I put a breakpoint in the code with gdb, jump in the code, and leave 
the function again, it does not crash! Also, it is no problem on Linux.

It has to do with the fact that in the first case, a real value was 
used. In the complex-value case, it does not crash. I went through the 
generated cpp file and I don't see any issues there (the reason I use 
cpp is because it's in a big project that needs cpp enabled; it is 
further linked and so on).

gcc version used: 4.6.2 (mingw)
cython version used: 0.18 (I did pip install Cython)
python version used: python 2.7.3 (MSC v.1500 32 bit).

Looking forward to hearing from you!

With kind regards,
Martin

-- 
-----------------------------------------------------------

ir. Martin Fiers

Photonics Research Group
Universiteit Gent - Ghent University

Sint-Pietersnieuwstraat 41
9000 Gent - Belgium

T + 32 9 264 34 48
E martin.fiers at intec.ugent.be
W www.caphesim.com

-----------------------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile_bug.py
Type: text/x-python
Size: 1087 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20130326/a5db4a3e/attachment.py>
-------------- next part --------------
import cython

@cython.cdivision(True)
cdef public double complex my_func(int a, b): 

    if(a==0):
        return 1;    
    else:
        return b;

def my_func_exposed(int a, b):
    return my_func(a,b)

From robertwb at gmail.com  Tue Mar 26 18:48:43 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Tue, 26 Mar 2013 10:48:43 -0700
Subject: [Cython] Bug: Returning real value crashes the code,
 complex value does not
In-Reply-To: <51516FC2.1090802@intec.ugent.be>
References: <51516FC2.1090802@intec.ugent.be>
Message-ID: <CADiQ+QBARGSA=unE=_si2A2_2Jx+jwQfP8h7wNRSP+YXSwGM3Q@mail.gmail.com>

On Tue, Mar 26, 2013 at 2:52 AM, Martin Fiers
<Martin.Fiers at intec.ugent.be> wrote:
> Dear Cython developers,
>
> I stumbled upon a strange error when using Cython. I made a minimal working
> example, see attachment for the two necessary files. (btw I didn't find the
> e-mail address of Robert Bradshaw so I could not request him for an account
> on the issue tracker. Is it possible to put the bug on there?)

Sure. You should have my email now.

> To reproduce the bug:
> 1) Reboot to Windows :) (the bug only appears on Windows)
> 2) Run compile_bug.py to generate the Cython extension
> 3) Try to run the my_func_exposed function:
>
> python
>>>> import complex_double
> (does not crash)
>>>> complex_double.my_func_exposed(1,1j)
> (crashes)
>>>> complex_double.my_func_exposed(1,1)
>
> If I put a breakpoint in the code with gdb, jump in the code, and leave the
> function again, it does not crash! Also, it is no problem on Linux.
>
> It has to do with the fact that in the first case, a real value was used. In
> the complex-value case, it does not crash. I went through the generated cpp
> file and I don't see any issues there (the reason I use cpp is because it's
> in a big project that needs cpp enabled; it is further linked and so on).
>
> gcc version used: 4.6.2 (mingw)
> cython version used: 0.18 (I did pip install Cython)
> python version used: python 2.7.3 (MSC v.1500 32 bit).

Very strange. Does calling PyComplex_AsCComplex directly produce the
same crash? What about

    cdef complex double x = 1.0

or

    cdef object py_x = 1.0
    cdef complex double x = py_x

?

- Robert

From Martin.Fiers at intec.ugent.be  Wed Mar 27 01:12:13 2013
From: Martin.Fiers at intec.ugent.be (Martin Fiers)
Date: Wed, 27 Mar 2013 01:12:13 +0100
Subject: [Cython] Bug: Returning real value crashes the code,
 complex value does not
In-Reply-To: <CADiQ+QBARGSA=unE=_si2A2_2Jx+jwQfP8h7wNRSP+YXSwGM3Q@mail.gmail.com>
References: <51516FC2.1090802@intec.ugent.be>
	<CADiQ+QBARGSA=unE=_si2A2_2Jx+jwQfP8h7wNRSP+YXSwGM3Q@mail.gmail.com>
Message-ID: <5152395D.4000408@intec.ugent.be>


On 3/26/2013 6:48 PM, Robert Bradshaw wrote:
> On Tue, Mar 26, 2013 at 2:52 AM, Martin Fiers
> <Martin.Fiers at intec.ugent.be> wrote:
>> Dear Cython developers,
>>
>> I stumbled upon a strange error when using Cython. I made a minimal working
>> example, see attachment for the two necessary files. (btw I didn't find the
>> e-mail address of Robert Bradshaw so I could not request him for an account
>> on the issue tracker. Is it possible to put the bug on there?)
> Sure. You should have my email now.
Thank you! I just sent a mail.

Also, thanks for replying so quickly. Replies follow inline.
>
>> To reproduce the bug:
>> 1) Reboot to Windows :) (the bug only appears on Windows)
>> 2) Run compile_bug.py to generate the Cython extension
>> 3) Try to run the my_func_exposed function:
>>
>> python
>>>>> import complex_double
>> (does not crash)
>>>>> complex_double.my_func_exposed(1,1j)
>> (crashes)
>>>>> complex_double.my_func_exposed(1,1)
>> If I put a breakpoint in the code with gdb, jump in the code, and leave the
>> function again, it does not crash! Also, it is no problem on Linux.
>>
>> It has to do with the fact that in the first case, a real value was used. In
>> the complex-value case, it does not crash. I went through the generated cpp
>> file and I don't see any issues there (the reason I use cpp is because it's
>> in a big project that needs cpp enabled; it is further linked and so on).
>>
>> gcc version used: 4.6.2 (mingw)
>> cython version used: 0.18 (I did pip install Cython)
>> python version used: python 2.7.3 (MSC v.1500 32 bit).
> Very strange. Does calling PyComplex_AsCComplex directly produce the
> same crash? What about
I'm not sure how to call this directly. Do you mean by modifying the 
generated cpp file and then manually building an extension module?
>
>      cdef complex double x = 1.0
This one works.
>
> or
>
>      cdef object py_x = 1.0
>      cdef complex double x = py_x
This one crashes!

Regards,
Martin
> ?
>
> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel


-- 
-----------------------------------------------------------

ir. Martin Fiers

Photonics Research Group
Universiteit Gent - Ghent University

Sint-Pietersnieuwstraat 41
9000 Gent - Belgium

T + 32 9 264 34 48
E martin.fiers at intec.ugent.be
W www.caphesim.com

-----------------------------------------------------------


From robertwb at gmail.com  Wed Mar 27 22:36:12 2013
From: robertwb at gmail.com (Robert Bradshaw)
Date: Wed, 27 Mar 2013 14:36:12 -0700
Subject: [Cython] Bug: Returning real value crashes the code,
 complex value does not
In-Reply-To: <5152BF75.5030708@intec.ugent.be>
References: <51516FC2.1090802@intec.ugent.be>
	<CADiQ+QBARGSA=unE=_si2A2_2Jx+jwQfP8h7wNRSP+YXSwGM3Q@mail.gmail.com>
	<5152395D.4000408@intec.ugent.be>
	<CADiQ+QALD0mS6ZaY_uObDZjDz8kyZkLUBU8ncb297EpvDPK1jg@mail.gmail.com>
	<5152BF75.5030708@intec.ugent.be>
Message-ID: <CADiQ+QCEGeLJV5pQjtFMs2N5+T3L_iFhDuHLhrdhV7zOxur4OA@mail.gmail.com>

On Wed, Mar 27, 2013 at 2:44 AM, Martin Fiers
<Martin.Fiers at intec.ugent.be> wrote:
> On 3/27/2013 3:54 AM, Robert Bradshaw wrote:
>>
>> On Tue, Mar 26, 2013 at 5:12 PM, Martin Fiers
>> <Martin.Fiers at intec.ugent.be> wrote:
>>>
>>> On 3/26/2013 6:48 PM, Robert Bradshaw wrote:
>>>>
>>>> On Tue, Mar 26, 2013 at 2:52 AM, Martin Fiers
>>>> <Martin.Fiers at intec.ugent.be> wrote:
>>>>>
>>>>> Dear Cython developers,
>>>>>
>>>>> I stumbled upon a strange error when using Cython. I made a minimal
>>>>> working
>>>>> example, see attachment for the two necessary files. (btw I didn't find
>>>>> the
>>>>> e-mail address of Robert Bradshaw so I could not request him for an
>>>>> account
>>>>> on the issue tracker. Is it possible to put the bug on there?)
>>>>
>>>> Sure. You should have my email now.
>>>
>>> Thank you! I just sent a mail.
>>>
>>> Also, thanks for replying so quickly. Replies follow inline.
>>>
>>>>> To reproduce the bug:
>>>>> 1) Reboot to Windows :) (the bug only appears on Windows)
>>>>> 2) Run compile_bug.py to generate the Cython extension
>>>>> 3) Try to run the my_func_exposed function:
>>>>>
>>>>> python
>>>>>>>>
>>>>>>>> import complex_double
>>>>>
>>>>> (does not crash)
>>>>>>>>
>>>>>>>> complex_double.my_func_exposed(1,1j)
>>>>>
>>>>> (crashes)
>>>>>>>>
>>>>>>>> complex_double.my_func_exposed(1,1)
>>>>>
>>>>> If I put a breakpoint in the code with gdb, jump in the code, and leave
>>>>> the
>>>>> function again, it does not crash! Also, it is no problem on Linux.
>>>>>
>>>>> It has to do with the fact that in the first case, a real value was
>>>>> used.
>>>>> In
>>>>> the complex-value case, it does not crash. I went through the generated
>>>>> cpp
>>>>> file and I don't see any issues there (the reason I use cpp is because
>>>>> it's
>>>>> in a big project that needs cpp enabled; it is further linked and so
>>>>> on).
>>>>>
>>>>> gcc version used: 4.6.2 (mingw)
>>>>> cython version used: 0.18 (I did pip install Cython)
>>>>> python version used: python 2.7.3 (MSC v.1500 32 bit).
>>>>
>>>> Very strange. Does calling PyComplex_AsCComplex directly produce the
>>>> same crash? What about
>>>
>>> I'm not sure how to call this directly. Do you mean by modifying the
>>> generated cpp file and then manually building an extension module?
>>>
>>>>       cdef complex double x = 1.0
>>>
>>> This one works.
>>>
>>>> or
>>>>
>>>>       cdef object py_x = 1.0
>>>>       cdef complex double x = py_x
>>>
>>> This one crashes!
>>
>> Ah. Try
>>
>>      from cpython.complex cimport Py_complex, PyComplex_AsCComplex
>>      cdef Py_complex x = PyComplex_AsCComplex(py_x)
>>      print x.real, x.imag
>
> Ok. I tried this, and it also crashes. Here's the modification:
>
> from cpython.complex cimport Py_complex
> from cpython.complex cimport PyComplex_AsCComplex
>
> @cython.cdivision(True)
> cdef public double complex my_func(int a, b):
>
>     cdef object py_x = 1.0
>
>     #cdef double complex x = 1.0                    # Does not crash
>     #cdef double complex x2 = py_x                  # Crashes for py_x = 1,
> not for py_x=1j.
>     #cdef Py_complex x = PyComplex_AsCComplex(py_x) # Crashes, even for
> py_x=1j
>     #print x.real, x.imag
>
> And as you can see, the PyComplex_AsCComplex also crashes (SIGSEGV).
> I tried to compile with debug information, as in the instructions in
> http://docs.cython.org/src/userguide/debugging.html
> But I cannot get the line numbers. Probably I need a debug-python version,
> but that seems to be very nontrivial on Windows.
>
> Not sure if I can think of other options to test it and/or track down the
> bug...
>
> Now it even crashes when py_x = 1j. So maybe there's something else going
> wrong here too.

I wonder if it's a compiler miss-match or something like that.

> Regards,
> Martin
>
> P.S. I only replied to you because you didn't put the
> cython-devel at python.org in the previous mail.

Oops. Un-intentional oversight.

From dalcinl at gmail.com  Fri Mar 29 20:20:12 2013
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 29 Mar 2013 22:20:12 +0300
Subject: [Cython] Commit f2a4b09b broke petsc4py
Message-ID: <CAEcYPwB=vJ1D8EKWezT9TpqezO8Hs1aZLCKyQBK3j+Sjp9P8=A@mail.gmail.com>

https://github.com/cython/cython/commit/f2a4b09b94dc0783625dc869af0880742c29f58d

I could not figure out how to fix it, but the following patch to the
test case reproduces the problem:


diff --git a/tests/run/tp_new_cimport.srctree b/tests/run/tp_new_cimport.srctree
index d60d712..632172c 100644
--- a/tests/run/tp_new_cimport.srctree
+++ b/tests/run/tp_new_cimport.srctree
@@ -42,7 +42,7 @@ def test_sub():

 ######## a.pxd ########

-cdef class ExtTypeA:
+cdef api class ExtTypeA[type ExtTypeA_Type, object ExtTypeAObject]:
     cdef readonly attrA

 ######## a.pyx ########


--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

From stefan_ml at behnel.de  Fri Mar 29 21:23:43 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 29 Mar 2013 21:23:43 +0100
Subject: [Cython] Commit f2a4b09b broke petsc4py
In-Reply-To: <CAEcYPwB=vJ1D8EKWezT9TpqezO8Hs1aZLCKyQBK3j+Sjp9P8=A@mail.gmail.com>
References: <CAEcYPwB=vJ1D8EKWezT9TpqezO8Hs1aZLCKyQBK3j+Sjp9P8=A@mail.gmail.com>
Message-ID: <5155F84F.3000102@behnel.de>

Hi Lisandro!

Lisandro Dalcin, 29.03.2013 20:20:
> https://github.com/cython/cython/commit/f2a4b09b94dc0783625dc869af0880742c29f58d
> 
> I could not figure out how to fix it, but the following patch to the
> test case reproduces the problem:
> 
> 
> diff --git a/tests/run/tp_new_cimport.srctree b/tests/run/tp_new_cimport.srctree
> index d60d712..632172c 100644
> --- a/tests/run/tp_new_cimport.srctree
> +++ b/tests/run/tp_new_cimport.srctree
> @@ -42,7 +42,7 @@ def test_sub():
> 
>  ######## a.pxd ########
> 
> -cdef class ExtTypeA:
> +cdef api class ExtTypeA[type ExtTypeA_Type, object ExtTypeAObject]:
>      cdef readonly attrA
> 
>  ######## a.pyx ########

Hmm, yes, that's not obvious to me either. I pushed a quick fix, but I'm
sure there's a cleaner way to do this. (And if there isn't, there should be
one...)

https://github.com/cython/cython/commit/3257193a7865c1f45ac2479954be5569f0b8337e

Stefan