[Cython] memoryview slices can't be None?

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Fri Feb 3 19:15:07 CET 2012


On 02/03/2012 07:07 PM, mark florisson wrote:
> On 3 February 2012 18:06, mark florisson<markflorisson88 at gmail.com>  wrote:
>> On 3 February 2012 17:53, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no>  wrote:
>>> On 02/03/2012 12:09 AM, mark florisson wrote:
>>>>
>>>> On 2 February 2012 21:38, Dag Sverre Seljebotn
>>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>>>
>>>>> On 02/02/2012 10:16 PM, mark florisson wrote:
>>>>>>
>>>>>>
>>>>>> On 2 February 2012 12:19, Dag Sverre Seljebotn
>>>>>> <d.s.seljebotn at astro.uio.no>      wrote:
>>>>>>>
>>>>>>>
>>>>>>> I just realized that
>>>>>>>
>>>>>>> cdef int[:] a = None
>>>>>>>
>>>>>>> raises an exception; even though I'd argue that 'a' is of the
>>>>>>> "reference" kind of type where Cython usually allows None (i.e.,
>>>>>>> "cdef MyClass b = None" is allowed even if type(None) is NoneType).
>>>>>>> Is this a bug or not, and is it possible to do something about it?
>>>>>>>
>>>>>>> Dag Sverre
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yeah I disabled that quite early. It was supposed to be working but
>>>>>> gave a lot of trouble in cases (segfaults, mainly). At the time I was
>>>>>> trying to get rid of all the segfaults and get the basic functionality
>>>>>> working, so I disabled it. Personally, I have never liked how things
>>>>>
>>>>>
>>>>>
>>>>> Well, you can segfault quite easily with
>>>>>
>>>>> cdef MyClass a = None
>>>>> print a.field
>>>>>
>>>>> so it doesn't make sense to treat slices differently from cdef classes IMO.
>>>>>
>>>>>
>>>>>> can be None unchecked. I personally prefer to write
>>>>>>
>>>>>> cdef foo(obj=None):
>>>>>>      cdef int[:] a
>>>>>>      if obj is None:
>>>>>>          obj = ...
>>>>>>      a = obj
>>>>>>
>>>>>> Often you forget to write 'not None' when declaring the parameter (and
>>>>>> apparently that is only allowed for 'def' functions).
>>>>>>
>>>>>> As such, I never bothered to re-enable it. However, it does support
>>>>>> control flow with uninitialized slices, and will raise an error if it
>>>>>> is uninitialized. Do we want this behaviour (e.g. for consistency)?
>>>>>
>>>>>
>>>>>
>>>>> When in doubt, go for consistency. So +1 for that reason. I do believe
>>>>> that setting stuff to None is rather vital in Python.
>>>>>
>>>>> What I typically do is more like this:
>>>>>
>>>>> def f(double[:] input, double[:] out=None):
>>>>>     if out is None:
>>>>>         out = np.empty_like(input)
>>>>>     ...
>>>>>
>>>>> Having to use another variable name is a bit of a pain. (Come on -- do
>>>>> you use "a" in real code? What do you actually call "the other obj"? I
>>>>> sometimes end up with "out_" and so on, but it creates smelly code quite
>>>>> quickly.)
>>>>
>>>>
>>>> No, it was just a contrived example.
>>>>
>>>>> It's easy to segfault with cdef classes anyway, so decent none-checking
>>>>> should be implemented at some point, and then memoryviews would use the
>>>>> same mechanisms. Java has decent null-checking...
>>>>>
>>>>
>>>> The problem with none checking is that it has to occur at every point.
>>>
>>>
>>> Well, using control flow analysis etc. it doesn't really. E.g.,
>>>
>>> for i in range(a.shape[0]):
>>>     print i
>>>     a[i] *= 3
>>>
>>> can be unrolled and none-checks inserted as
>>>
>>> print 0
>>> if a is None: raise ....
>>> a[0] *= 3
>>> for i in range(1, a.shape[0]):
>>>     print i
>>>     a[i] *= 3 # no need for none-check
>>>
>>> It's very similar to what you'd want to do to pull boundschecking out of the
>>> loop...
>>>
>>
>> Oh, definitely. Both optimizations may not always be possible to do,
>> though. The optimization (for boundschecking) is easier for prange()
>> than range(), as you can immediately raise an exception as the
>> exceptional condition may be issued at any iteration.  What do you do
>> with bounds checking when some accesses are in-bound, and some are
>> out-of-bound? Do you immediately raise the exception? Are we fine with
>> aborting (like Fortran compilers do when you ask them for bounds
>> checking)? And how do you detect that the code doesn't already raise
>> an exception or break out of the loop itself to prevent the
>> out-of-bound access? (Unless no exceptions are propagating and no
>> break/return is used, but exceptions are so very common).
>>
>>>> With initialized slices the control flow knows when the slices are
>>>> initialized, or when they might not be (and it can raise a
>>>> compile-time or runtime error, instead of a segfault if you're lucky).
>>>> I'm fine with implementing the behaviour, I just always left it at the
>>>> bottom of my todo list.
>>>
>>>
>>> Wasn't saying you should do it, just checking.
>>>
>>> I'm still not sure about this. I think what I'd really like is
>>>
>>>   a) Stop cdef classes from being None as well
>>>
>>>   b) Sort-of deprecate cdef in favor of cast/assertion type statements that
>>>      help the type inference:
>>>
>>> def f(arr):
>>>     if arr is None:
>>>         arr = ...
>>>     arr = int[:](arr) # equivalent to "cdef int[:] arr = arr", but
>>>                       # acts as statement, with a specific point
>>>                       # for the none-check
>>>     ...
>>>
>>> or even:
>>>
>>> def f(arr):
>>>     if arr is None:
>>>         return 'foo'
>>>     else:
>>>         arr = int[:](arr) # takes effect *here*, does none-check
>>>         ...
>>>     # arr still typed as int[:] here
>>>
>>> If we can make this work well enough with control flow analysis I'd never
>>> cdef declare local vars again :-)
>>
>> Hm, what about the following?
>>
>> def f(arr):
>>     if arr is None:
>>         return 'foo'
>>
>>     cdef int[:] arr # arr may not be None
>
> The above would work in general: until the declaration is lexically
> encountered, the object is typed as object.

This was actually going to be my first proposal :-) That would finally
define how "cdef" inside of if-statements etc. behaves too (simply use
control flow analysis and treat it like a statement).

But I like int[:] as a way of making it pure Python syntax compatible as
well. Perhaps the two are orthogonal -- a) make variable declaration a
statement, b) make cython.int[:](x) do, essentially, a cdef declaration,
for Python compatibility.
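
A rough pure-Python sketch of what such a cast-as-statement could look
like, only to illustrate the idea. Note that as_int_view is a made-up
helper standing in for the proposed cython.int[:](x); it is not an
existing Cython API, and the built-in memoryview merely stands in for a
typed slice here.

```python
from array import array


def as_int_view(obj):
    """Hypothetical stand-in for the proposed int[:](obj) cast.

    The none-check happens exactly at the point where this is called.
    """
    if obj is None:
        raise TypeError("expected a 1-D int buffer, got None")
    view = memoryview(obj)
    # Assumption for this sketch: "int[:]" means a 1-D buffer of C ints.
    if view.ndim != 1 or view.format != "i":
        raise TypeError("expected a 1-D buffer of C ints")
    return view


def f(arr):
    if arr is None:
        return "foo"
    # From this statement on, arr is known to be a non-None int view.
    arr = as_int_view(arr)
    return sum(arr)


print(f(None))
print(f(array("i", [1, 2, 3])))
```

The point is only that the check has one well-defined location (the
cast statement), so control flow analysis would not need to guard every
later access.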

Dag

