[Cython] memoryview slices can't be None?

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Fri Feb 3 18:53:02 CET 2012


On 02/03/2012 12:09 AM, mark florisson wrote:
> On 2 February 2012 21:38, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no>  wrote:
>> On 02/02/2012 10:16 PM, mark florisson wrote:
>>>
>>> On 2 February 2012 12:19, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>>
>>>> I just realized that
>>>>
>>>> cdef int[:] a = None
>>>>
>>>> raises an exception, even though I'd argue that 'a' is of the "reference"
>>>> kind of type where Cython usually allows None (i.e., "cdef MyClass b =
>>>> None" is allowed even if type(None) is NoneType). Is this a bug or not,
>>>> and is it possible to do something about it?
>>>>
>>>> Dag Sverre
>>>
>>>
>>> Yeah I disabled that quite early. It was supposed to be working but
>>> gave a lot of trouble in some cases (segfaults, mainly). At the time I was
>>> trying to get rid of all the segfaults and get the basic functionality
>>> working, so I disabled it. Personally, I have never liked how things
>>
>>
>> Well, you can segfault quite easily with
>>
>> cdef MyClass a = None
>> print a.field
>>
>> so it doesn't make sense to treat slices differently from cdef classes IMO.
>>
>>
>>> can be None unchecked. I personally prefer to write
>>>
>>> cdef foo(obj=None):
>>>      cdef int[:] a
>>>      if obj is None:
>>>          obj = ...
>>>      a = obj
>>>
>>> Often you forget to write 'not None' when declaring the parameter (and
>>> apparently that is only allowed for 'def' functions).
>>>
>>> As such, I never bothered to re-enable it. However, it does support
>>> control flow with uninitialized slices, and will raise an error if it
>>> is uninitialized. Do we want this behaviour (e.g. for consistency)?
>>
>>
>> When in doubt, go for consistency. So +1 for that reason. I do believe that
>> setting stuff to None is rather vital in Python.
>>
>> What I typically do is more like this:
>>
>> def f(double[:] input, double[:] out=None):
>>     if out is None:
>>         out = np.empty_like(input)
>>     ...
>>
>> Having to use another variable name is a bit of a pain. (Come on -- do you
>> use "a" in real code? What do you actually call "the other obj"? I sometimes
>> end up with "out_" and so on, but it creates smelly code quite quickly.)
>
> No, it was just a contrived example.
>
>> It's easy to segfault with cdef classes anyway, so decent nonechecking
>> should be implemented at some point, and then memoryviews would use the same
>> mechanisms. Java has decent null-checking...
>>
>
> The problem with none checking is that it has to occur at every point.

Well, using control flow analysis etc. it doesn't really. E.g.,

for i in range(a.shape[0]):
     print i
     a[i] *= 3

can be unrolled and none-checks inserted as

print 0
if a is None: raise ....
a[0] *= 3
for i in range(1, a.shape[0]):
     print i
     a[i] *= 3 # no need for none-check

It's very similar to what you'd want to do to pull bounds checking out of
the loop...
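
As a rough illustration of the peeling idea, here is a plain-Python sketch
(not Cython, and not what the compiler actually generates). The names
none_check, scale_naive and scale_peeled are made up for this example, the
length n is passed in separately so that only the indexing needs a check,
and a one-element counter list is used purely to show how many checks run.

```python
def none_check(a, counter):
    # Stand-in for the check the compiler would insert; counts each call.
    counter[0] += 1
    if a is None:
        raise TypeError("'NoneType' object is not subscriptable")


def scale_naive(a, n, counter):
    # Check on every iteration: n checks for n elements.
    for i in range(n):
        none_check(a, counter)
        a[i] *= 3
    return a


def scale_peeled(a, n, counter):
    # First iteration peeled off: 'a' cannot become None inside the loop,
    # so a single check before the first access covers the rest.
    if n > 0:
        none_check(a, counter)
        a[0] *= 3
        for i in range(1, n):
            a[i] *= 3  # no need for none-check
    return a


if __name__ == "__main__":
    naive_count, peeled_count = [0], [0]
    print(scale_naive([1, 2, 3, 4], 4, naive_count), naive_count[0])
    print(scale_peeled([1, 2, 3, 4], 4, peeled_count), peeled_count[0])
```

Both versions raise at the same point for a None argument (on the first
access), and neither raises when the loop body never runs.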

> With initialized slices the control flow knows when the slices are
> initialized, or when they might not be (and it can raise a
> compile-time or runtime error, instead of a segfault if you're lucky).
> I'm fine with implementing the behaviour, I just always left it at the
> bottom of my todo list.

Wasn't saying you should do it, just checking.

I'm still not sure about this. I think what I'd really like is

  a) Stop cdef classes from being None as well

  b) Sort-of deprecate cdef in favor of cast/assertion type statements
that help type inference:

def f(arr):
     if arr is None:
         arr = ...
     arr = int[:](arr) # equivalent to "cdef int[:] arr = arr", but
                       # acts as statement, with a specific point
                       # for the none-check
     ...

or even:

def f(arr):
     if arr is None:
         return 'foo'
     else:
         arr = int[:](arr) # takes effect *here*, does none-check
         ...
     # arr still typed as int[:] here

If we can make this work well enough with control flow analysis I'd 
never cdef declare local vars again :-)
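
To make the "specific point for the none-check" concrete, here is a loose
plain-Python emulation (not Cython syntax, and not an existing Cython
feature). as_int_view is a hypothetical helper standing in for the proposed
int[:](arr) statement; it uses the built-in memoryview and the array module,
and the default buffer in f is an arbitrary placeholder.

```python
from array import array


def as_int_view(obj):
    # Hypothetical stand-in for "int[:](obj)": the none-check and the
    # type check both happen exactly here, at the point of the call.
    if obj is None:
        raise TypeError("expected a 1-D int buffer, got None")
    view = memoryview(obj)
    if view.ndim != 1 or view.format != "i":
        raise TypeError("expected a 1-D buffer of C ints")
    return view


def f(arr=None):
    if arr is None:
        arr = array("i", [0, 0, 0])  # placeholder default
    arr = as_int_view(arr)  # from here on, arr is known to be a valid view
    return arr.shape[0]


if __name__ == "__main__":
    print(f())
    print(f(array("i", [1, 2])))
```

Rebinding the same name keeps the "out_"-style extra variable away, which
is the main attraction over a separate cdef declaration.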

Dag

