[Cython] memoryview slices can't be None?

mark florisson markflorisson88 at gmail.com
Fri Feb 3 19:07:09 CET 2012


On 3 February 2012 18:06, mark florisson <markflorisson88 at gmail.com> wrote:
> On 3 February 2012 17:53, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no> wrote:
>> On 02/03/2012 12:09 AM, mark florisson wrote:
>>>
>>> On 2 February 2012 21:38, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no>  wrote:
>>>>
>>>> On 02/02/2012 10:16 PM, mark florisson wrote:
>>>>>
>>>>>
>>>>> On 2 February 2012 12:19, Dag Sverre Seljebotn
>>>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>>>>
>>>>>>
>>>>>> I just realized that
>>>>>>
>>>>>> cdef int[:] a = None
>>>>>>
>>>>>> raises an exception, even though I'd argue that 'a' is of the
>>>>>> "reference" kind of type where Cython usually allows None (i.e.,
>>>>>> "cdef MyClass b = None" is allowed even if type(None) is NoneType).
>>>>>> Is this a bug or not, and is it possible to do something about it?
>>>>>>
>>>>>> Dag Sverre
>>>>>> _______________________________________________
>>>>>> cython-devel mailing list
>>>>>> cython-devel at python.org
>>>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>>>
>>>>>
>>>>>
>>>>> Yeah I disabled that quite early. It was supposed to be working but
>>>>> gave a lot of trouble in some cases (segfaults, mainly). At the time I was
>>>>> trying to get rid of all the segfaults and get the basic functionality
>>>>> working, so I disabled it. Personally, I have never liked how things
>>>>
>>>>
>>>>
>>>> Well, you can segfault quite easily with
>>>>
>>>> cdef MyClass a = None
>>>> print a.field
>>>>
>>>> so it doesn't make sense to treat slices differently from cdef classes IMO.
>>>>
>>>>
>>>>> can be None unchecked. I personally prefer to write
>>>>>
>>>>> cdef foo(obj=None):
>>>>>     cdef int[:] a
>>>>>     if obj is None:
>>>>>         obj = ...
>>>>>     a = obj
>>>>>
>>>>> Often you forget to write 'not None' when declaring the parameter (and
>>>>> apparently that is only allowed for 'def' functions).
>>>>>
>>>>> As such, I never bothered to re-enable it. However, it does support
>>>>> control flow with uninitialized slices, and will raise an error if it
>>>>> is uninitialized. Do we want this behaviour (e.g. for consistency)?
>>>>
>>>>
>>>>
>>>> When in doubt, go for consistency. So +1 for that reason. I do believe
>>>> that setting stuff to None is rather vital in Python.
>>>>
>>>> What I typically do is more like this:
>>>>
>>>> def f(double[:] input, double[:] out=None):
>>>>    if out is None:
>>>>        out = np.empty_like(input)
>>>>    ...
>>>>
>>>> Having to use another variable name is a bit of a pain. (Come on -- do
>>>> you use "a" in real code? What do you actually call "the other obj"? I
>>>> sometimes end up with "out_" and so on, but it creates smelly code quite
>>>> quickly.)
>>>
>>>
>>> No, it was just a contrived example.
>>>
>>>> It's easy to segfault with cdef classes anyway, so decent none-checking
>>>> should be implemented at some point, and then memoryviews would use the
>>>> same mechanisms. Java has decent null-checking...
>>>>
>>>
>>> The problem with none checking is that it has to occur at every point.
>>
>>
>> Well, using control flow analysis etc. it doesn't really. E.g.,
>>
>> for i in range(a.shape[0]):
>>    print i
>>    a[i] *= 3
>>
>> can be unrolled and none-checks inserted as
>>
>> print 0
>> if a is None: raise ....
>> a[0] *= 3
>> for i in range(1, a.shape[0]):
>>    print i
>>    a[i] *= 3 # no need for none-check
>>
>> It's very similar to what you'd want to do to pull boundschecking out of the
>> loop...
>>
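
The transformation described above can be sketched in plain Python (not
Cython, and not what the compiler actually generates). The names
scale_naive, scale_peeled, n and log are hypothetical; n is passed
separately so that the loop bound does not itself touch 'a', and
log.append(i) stands in for "print i":

```python
def scale_naive(a, n, log):
    """Check for None on every iteration, as unoptimized code might."""
    for i in range(n):
        log.append(i)  # stands in for "print i"
        if a is None:
            raise TypeError("'NoneType' object is not subscriptable")
        a[i] *= 3


def scale_peeled(a, n, log):
    """Peel the first iteration so the None check runs at most once."""
    if n > 0:
        log.append(0)
        if a is None:
            raise TypeError("'NoneType' object is not subscriptable")
        a[0] *= 3
    for i in range(1, n):
        log.append(i)
        a[i] *= 3  # no None check: 'a' is never rebound in this function
```

Both versions record the same side effects before failing: with a = None
and n = 3, each appends 0 to log and then raises TypeError. That ordering
is the reason the first iteration is peeled rather than the check simply
being hoisted above the loop.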
>
> Oh, definitely. Both optimizations may not always be possible to do,
> though. The optimization (for boundschecking) is easier for prange()
> than range(), as you can immediately raise an exception, since the
> exceptional condition may be issued at any iteration.  What do you do
> with bounds checking when some accesses are in-bound, and some are
> out-of-bound? Do you immediately raise the exception? Are we fine with
> aborting (like Fortran compilers do when you ask them for bounds
> checking)? And how do you detect that the code doesn't already raise
> an exception or break out of the loop itself to prevent the
> out-of-bound access? (Unless no exceptions are propagating and no
> break/return is used, but exceptions are so very common).
>
>>> With initialized slices the control flow knows when the slices are
>>> initialized, or when they might not be (and it can raise a
>>> compile-time or runtime error, instead of a segfault if you're lucky).
>>> I'm fine with implementing the behaviour, I just always left it at the
>>> bottom of my todo list.
>>
>>
>> Wasn't saying you should do it, just checking.
>>
>> I'm still not sure about this. I think what I'd really like is
>>
>>  a) Stop cdef classes from being None as well
>>
>>  b) Sort-of deprecate cdef in favor of cast/assertion type statements that
>> help the type inferences:
>>
>> def f(arr):
>>    if arr is None:
>>        arr = ...
>>    arr = int[:](arr) # equivalent to "cdef int[:] arr = arr", but
>>                      # acts as statement, with a specific point
>>                      # for the none-check
>>    ...
>>
>> or even:
>>
>> def f(arr):
>>    if arr is None:
>>        return 'foo'
>>    else:
>>        arr = int[:](arr) # takes effect *here*, does none-check
>>        ...
>>    # arr still typed as int[:] here
>>
>> If we can make this work well enough with control flow analysis I'd never
>> cdef declare local vars again :-)
>
> Hm, what about the following?
>
> def f(arr):
>    if arr is None:
>        return 'foo'
>
>    cdef int[:] arr # arr may not be None

The above would work in general: until the declaration is lexically
encountered, the object is typed as object.
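
A rough plain-Python analogue of that idea, purely as an illustration:
the built-in memoryview() call is only a stand-in for the point where
"cdef int[:] arr" would take effect, and the format check against 'i' is
an assumption made for this sketch, not Cython behaviour.

```python
from array import array


def f(arr):
    # Up to this point arr is an ordinary Python object, so it may be None.
    if arr is None:
        return 'foo'

    # Stand-in for the lexical point of "cdef int[:] arr": from here on
    # arr is a typed view over a buffer of C ints and cannot be None.
    arr = memoryview(arr)
    if arr.format != 'i':
        raise TypeError("expected a buffer of C ints")
    return sum(arr)
```

With this sketch, f(None) returns 'foo' and f(array('i', [1, 2, 3]))
returns 6, while a buffer of another element type such as
array('d', [1.0]) is rejected with TypeError.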

>> Dag
>>

