[Cython] Safer default exception handling with return type annotations?

Fri Sep 8 00:51:41 EDT 2017

On Wed, Sep 6, 2017 at 12:06 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Robert Bradshaw schrieb am 06.09.2017 um 08:28:
>> On Tue, Sep 5, 2017 at 10:44 PM, Stefan Behnel wrote:
>>> Robert Bradshaw schrieb am 06.09.2017 um 07:21:
>>>> I'm not a huge fan of behaving differently depending on what syntax
>>>> was used to annotate the return type--I'd rather they be 100% aliases
>>>> of each other.
>>>
>>> Regarding this bit - I already chose to implement some differences for
>>> annotation typing. Mainly, if you say
>>>
>>>     def f(x: int) -> float:
>>>         return x
>>>
>>> then the (plain "def") function will actually be typed as "double
>>> f(object)", assuming that you probably meant the Python types and not the
>>> C types. If you want the C types "int" and "float", you have to use either
>>> of these:
>>>
>>>     def f1(x: cython.int) -> cython.float:
>>>         return x
>>>
>>>     cpdef float f2(int x):
>>>         return x
>>>
>>> That is because the main use case of signature annotations is Python code
>>> compatibility, so I tried to change the semantics as little as possible
>>> from what the code would be expected to do in Python.
>>
>> What about
>>
>> def f(x: float) -> int:
>>   return x * 2
>>
>> would that throw an error if x was, say, a str?
>
> It would raise an exception on input, but there would not currently be an
> error on return.
>
>
>> I think float -> c double but int -> python object will be surprising.
>
> I agree. There are two reasons: we don't currently use the int/long Python
> types anywhere in Cython (which could obviously be changed), and in Python
> 2, this would exclude "long" objects, which is most likely not intended.
> So, "int" would probably best refer to "int object or long object" in
> Python 2, which definitely complicates things.
>
> Besides, how many functions can really deal with both "int" and "str" input
> completely? Your example above is actually a good one, because it would
> currently return a float object, not "int". But, since it would do the same
> thing when run in Python, that's probably acceptable. It would need an
> explicit conversion in both cases, in which case Cython wouldn't need to
> enforce the type anymore.
>
> If "int" is used as input type (or variable declaration), then enforcing it
> somehow on assignment would be much more relevant.

Yes, this was my point.

    def f(x: float):
        return x * 2
    f("str")

would throw an error but

    def f(x: int):
        return x * 2
    f("str")

would not, which seems inconsistent. Also,

    def f(x: List[float]):
        return x * 2
    f("str")  # error
    f(["str"])  # error?
    f([1.5]) # ok
    f([1]) # ok?

    from some_module import X
    def g(x: X):
        return x * 2
    g("str") # error?

There is a whole can of worms here once one starts partially "enforcing"
these types. I think we need a very clear list of what exactly is
supported. Perhaps we should start out with just supporting cython.*
and perhaps cimported classes.
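
To make the float/int inconsistency above concrete, here is a plain-Python
sketch that only emulates the behaviour described in this thread. It is not
Cython output. The names as_c_double, f_float and f_int are made up for
illustration, and the helper is an assumption about what "convert the
argument to a C double on input" roughly amounts to.

```python
def as_c_double(x):
    """Hypothetical stand-in for converting an argument to a C double."""
    # Accept real numbers and anything whose type defines __float__;
    # reject everything else, as a typed argument conversion would.
    if isinstance(x, (int, float)) or hasattr(type(x), "__float__"):
        return float(x)
    raise TypeError("a float is required, got %s" % type(x).__name__)


def f_float(x):
    # Emulates "def f(x: float)": the argument is converted on entry.
    x = as_c_double(x)
    return x * 2


def f_int(x):
    # Emulates "def f(x: int)": the annotation is not enforced.
    return x * 2
```

In this model f_float("str") raises TypeError, while f_int("str") quietly
returns a doubled string, which is the asymmetry I find surprising.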

> Maybe we should reconsider this whole business when we drop support for
> Python 2.7. ;)

I don't think we'll be able to get rid of 2.7 for quite some time...

>> I also worry a bit
>> about x: float being enforced but x: List[float] not being so.
>
> Interpreting "List[float]" as Cython type "list" would definitely be nice,
> but note that it would disallow subtypes. In Python, it does not.
>
> I think the right way to deal with that, eventually, will be optionally
> allowing subtypes also in Cython, and handling the distinction more on a
> case-by-case basis. Then you could declare a variable as "list" or
> "List[Any]", and enforce the exact type in the first case but not in the
> second.
>
> And we should definitely use "List[itemtype]" hints also in type inference
> for loops and indexing at some point.
>
>>> I think this type interpretation is a reasonable, use case driven
>>> difference to make. Hence my question whether we should extend this to the
>>> exception declaration.
>>
>> I suppose you've already made a case for deviating...
>>
>> I guess I think it'd be nice to change the default universally, but
>> that's perhaps a bigger conversation.
>
> I think so, too. For this specific case, we can change the default without
> breaking backwards compatibility.
>
> I also added a new decorator "@exceptval(x, check=False)" for pure mode. If
> used without arguments as "@exceptval()" or even "@exceptval(check=False)",
> which seems more readable, users could still get the "write unraisable but
> don't propagate" behaviour, if they really need it.

+1. Let's make the value required, perhaps with None meaning no value
(as it'll never be used). Perhaps check should default to True as
well: it is the more conservative choice, and the overhead is small
(it's an additional check only for exceptions).
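
For readers less familiar with the error-return convention, here is a toy
plain-Python model of the difference between check=True and check=False. It
is a sketch under stated assumptions, not how Cython implements this: the
module-level _pending_error stands in for the interpreter's error indicator,
and call_with_exceptval and parse_digit are hypothetical names.

```python
_pending_error = None  # toy stand-in for the interpreter's error indicator


def parse_digit(ch):
    """Hypothetical callee: returns -1 on failure and records the error."""
    global _pending_error
    if ch == "-":
        return -1  # a legitimate -1, no error recorded
    if not ch.isdigit():
        _pending_error = ValueError("not a digit: %r" % ch)
        return -1
    return int(ch)


def call_with_exceptval(func, arg, value, check=True):
    """Call func(arg) and decide whether an error has to be propagated."""
    global _pending_error
    _pending_error = None
    result = func(arg)
    if result != value:
        return result  # fast path: the exception value was not returned
    if check and _pending_error is None:
        return result  # the exception value was a legitimate result
    # Either check=False (the value always means "error") or an error is set.
    error, _pending_error = _pending_error, None
    if error is None:
        error = SystemError("error return without exception set")
    raise error
```

The extra lookup of the error indicator only happens when the exception
value actually comes back, which is why defaulting to check=True looks cheap
in this model. With check=False, a legitimate -1 from parse_digit("-") is
misread as an error.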