[Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

Stefan Behnel stefan_ml at behnel.de
Wed Mar 16 15:53:35 CET 2011


Robert Bradshaw, 11.03.2011 19:33:
> On Fri, Mar 11, 2011 at 9:36 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> Stefan Behnel, 11.03.2011 15:08:
>>>
>>> Vitja Makarov, 11.03.2011 15:04:
>>>>
>>>> 2011/3/11 Stefan Behnel:
>>>>>
>>>>> Personally, I think it would be nice to keep up Python's semantics, but
>>>>> when I implemented this I broke quite some code in Sage (you may have
>>>>> noticed that the sage-build project in Hudson has been red for a while).
>>>>> There are things in C and especially in C++ that cannot be easily copied
>>>>> into a temporary variable in order to make sure they are evaluated before
>>>>> the following arguments. This is not a problem for Python function calls
>>>>> where all arguments end up being copied (and often converted) anyway. It
>>>>> is a problem for C function calls, though.
>>>>
>>>>> f(g(a), a.x, h(a))
>>>>
>>>> Why could not this be translated into:
>>>>
>>>> tmp1 = g(a)
>>>> tmp2 = a.x
>>>> tmp3 = h(a)
>>>>
>>>> f(tmp1, tmp2, tmp3)
>>>
>>> See above.
>>
>> To be a little clearer here, it's a problem in C for example with struct
>> values. Copying them by value into a temp variable can be expensive,
>> potentially twice as expensive as simply passing them into the function
>> normally.
>
> Yep, and some types (e.g. array types) can't be assigned to at all.
> FWIW, the issue with Sage is that many libraries use the "stack
> allocated, pass-by-reference" trick
>
>      typedef foo_struct foo_t[1];
>
> but we have declared them to be of type "void*" because we don't care
> about or want to muck with the internals. Cleaning this up is
> something we should do, but is taking a while, and aside from that it
> makes Cython even more dependent on correct type declarations (and
> backwards incompatible in this regard). Sage is a great test for the
> buildbot, so keeping it red for so long is not a good thing either.
>
>> Not sure what kind of additional devilry C++ provides here, but I'd expect
>> that object values can exhibit bizarre behaviour when being assigned. Maybe
>> others can enlighten me here.
>
> Yes, C++ allows overloading of the assignment operator, so assigning
> may lead to arbitrary code (as well as probably an expensive copy, as
> with structs, and structs in C++ are really just classes with a
> different visibility).
>
>> I have no idea how many cases there actually are that we can't handle or
>> that may lead to a performance degradation when using temps, but the problem
>> is that they exist at all.
>
> Note that this applies not just to function arguments, but a host of
> other places, e.g. the order of evaluation of "f() + g()" in C is
> unspecified. Fleshing this out completely will lead to a lot more
> temps and verbose C code. And then we'd just cross our fingers that
> the C compiler was able to optimize all these temps away (though still
> possibly producing inferior code if it were able to switch order of
> execution).
>
> Whatever we do, we need a flag/directive. Perhaps there's a way to
> guarantee correct ordering for all valid Python code

Plain Python code should never suffer from the original problem (with the 
obvious exception of cpdef functions), but I also see how type inference 
can affect this guarantee for the evaluation order in expressions.

I agree with Greg, though, that it is a code smell when (multiple)
subexpressions in an expression have side effects, especially when the
result depends on their evaluation order. That means that the average
innocent future maintainer could break the code by manually rearranging the
expression in a way that's perfectly valid mathematically, e.g. as a
readability fix.
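
To make that concrete, here is a minimal C sketch. The helpers bump() and
twice() and the shared variable are made up for illustration, they are not
from any real code base. Both functions compute "a + b", and swapping the
two operands looks like a harmless rearrangement, yet the results differ
because each helper modifies the state the other one reads.

```c
/* Hypothetical shared state that both helpers modify. */
static int state;

/* Each helper has a side effect on 'state' and returns the new value. */
static int bump(void)  { state += 1; return state; }
static int twice(void) { state *= 2; return state; }

/* Evaluates bump() first, then twice(). */
int bump_then_twice(void)
{
    state = 1;
    int a = bump();   /* state becomes 2, a == 2 */
    int b = twice();  /* state becomes 4, b == 4 */
    return a + b;     /* 6 */
}

/* Same sum, but the operands are evaluated in the opposite order. */
int twice_then_bump(void)
{
    state = 1;
    int b = twice();  /* state becomes 2, b == 2 */
    int a = bump();   /* state becomes 3, a == 3 */
    return a + b;     /* 5 */
}
```

Written as a single expression "bump() + twice()", C would be free to pick
either order, so the result could be 6 or 5 depending on the compiler.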


> even in the
> presence of (function and variable) type inference, but allow some
> leeway for explicitly declared C functions.

I'm actually leaning towards not guaranteeing the order of evaluation
wherever C doesn't guarantee it either. If a specific order is really
required, it's easy for users to work around, but it's very hard for Cython
to fix in all cases, and the gain is truly small. After all, we'd only make
it easier for users to write bad code.
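
For reference, the user-side workaround is simply to evaluate the arguments
in separate statements, which are always sequenced. Below is a rough sketch
in plain C; f, g and h are the names from the example earlier in the thread,
but their bodies and the trace buffer are invented here purely to make the
call order observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer recording the order in which functions run. */
static char trace[8];
static size_t trace_len;

static int g(int a)        { trace[trace_len++] = 'g'; return a + 1; }
static int h(int a)        { trace[trace_len++] = 'h'; return a * 2; }
static int f(int x, int y) { trace[trace_len++] = 'f'; return x - y; }

/* Evaluates g(a) before h(a), regardless of how the compiler would
 * order the arguments of a direct call f(g(a), h(a)). */
int call_in_source_order(int a)
{
    trace_len = 0;
    memset(trace, 0, sizeof trace);

    int tmp1 = g(a);  /* sequenced first */
    int tmp2 = h(a);  /* sequenced second */
    return f(tmp1, tmp2);
}

/* Returns the NUL-terminated call order of the last invocation. */
const char *last_trace(void)
{
    return trace;
}
```

This is cheap for plain ints like these. The cases that worry me are the
ones discussed above (large structs, array types, C++ objects), where the
user knows whether a temp is acceptable and Cython does not.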

Stefan

