[Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

Thu Mar 17 05:05:01 CET 2011

On Wed, Mar 16, 2011 at 7:53 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Robert Bradshaw, 11.03.2011 19:33:
>>
>> On Fri, Mar 11, 2011 at 9:36 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>>
>>> Stefan Behnel, 11.03.2011 15:08:
>>>>
>>>> Vitja Makarov, 11.03.2011 15:04:
>>>>>
>>>>> 2011/3/11 Stefan Behnel:
>>>>>>
>>>>>> Personally, I think it would be nice to keep up Python's semantics, but
>>>>>> when I implemented this I broke quite some code in Sage (you may have
>>>>>> noticed that the sage-build project in Hudson has been red for a while).
>>>>>> There are things in C and especially in C++ that cannot be easily copied
>>>>>> into a temporary variable in order to make sure they are evaluated before
>>>>>> the following arguments. This is not a problem for Python function calls
>>>>>> where all arguments end up being copied (and often converted) anyway. It
>>>>>> is a problem for C function calls, though.
>>>>>
>>>>>> f(g(a), a.x, h(a))
>>>>>
>>>>> Why could not this be translated into:
>>>>>
>>>>> tmp1 = g(a)
>>>>> tmp2 = a.x
>>>>> tmp3 = h(a)
>>>>>
>>>>> f(tmp1, tmp2, tmp3)
>>>>
>>>> See above.
>>>
>>> To be a little clearer here, it's a problem in C for example with struct
>>> values. Copying them by value into a temp variable can be expensive,
>>> potentially twice as expensive as simply passing them into the function
>>> normally.
>>
>> Yep, and some types (e.g. array types) can't be assigned to at all.
>> FWIW, the issue with Sage is that many libraries use the "stack
>> allocated, pass-by-reference" trick
>>
>>     typedef foo_struct foo_t[1];
>>
>> but we have declared them to be of type "void*" because we don't care
>> about or want to muck with the internals. Cleaning this up is
>> something we should do, but is taking a while, and aside from that it
>> makes Cython even more dependent on correct type declarations (and
>> backwards incompatible in this regard). Sage is a great test for the
>> buildbot, so keeping it red for so long is not a good thing either.
>>
>>> Not sure what kind of additional devilry C++ provides here, but I'd expect
>>> that object values can exhibit bizarre behaviour when being assigned. Maybe
>>> others can enlighten me here.
>>
>> Yes, C++ allows overloading of the assignment operator, so assigning
>> may lead to arbitrary code (as well as probably an expensive copy, as
>> with structs, and structs in C++ are really just classes with a
>> different visibility).
>>
>>> I have no idea how many cases there actually are that we can't handle or
>>> that may lead to a performance degradation when using temps, but the
>>> problem is that they exist at all.
>>
>> Note that this applies not just to function arguments, but a host of
>> other places, e.g. the order of evaluation of "f() + g()" in C is
>> unspecified. Fleshing this out completely will lead to a lot more
>> temps and verbose C code. And then we'd just cross our fingers that
>> the C compiler was able to optimize all these temps away (though still
>> possibly producing inferior code if it were able to switch order of
>> execution).
>>
>> Whatever we do, we need a flag/directive. Perhaps there's a way to
>> guarantee correct ordering for all valid Python code
>
> Plain Python code should never suffer from the original problem (with the
> obvious exception of cpdef functions), but I also see how type inference can
> affect this guarantee for the evaluation order in expressions.

In particular, we need to be careful if we ever automatically cpdef a
function (as part of a future optimization).

> I agree with Greg, though, that it has a code smell if (multiple)
> subexpressions in an expression have side-effects, especially when they are
> impacted by the evaluation order. That means that the average innocent
> future maintainer could break the code by manually rearranging the
> expression in a way that's perfectly valid mathematically, e.g. as a
> readability fix.
>
>
>> even in the
>> presence of (function and variable) type inference, but allow some
>> leeway for explicitly declared C functions.
>
> I'm actually leaning towards not guaranteeing the order of execution if C
> doesn't do it either. If this is really required, it's easy to work around
> for users, but it's severely hard to fix for Cython in all cases, and the
> gain is truly small. After all, we'd only make it easier for users to write
> bad code.

Yep. Let's keep the code in for the above case.

- Robert
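
For reference, the user-side workaround discussed in the thread (evaluating each argument into a temporary before the call) can be sketched in plain Python. This is only an illustration of the semantics under discussion, not code from Cython or from ticket #654; the names Obj, g, h, f and calls are hypothetical. Plain Python evaluates call arguments left to right, so the direct call and the version with explicit temporaries should behave identically. The temporaries only matter once the call is compiled to a C function call, where the order is unspecified.

```python
# Hypothetical names (Obj, g, h, f, calls) used purely for illustration.
calls = []  # records the order in which subexpressions are evaluated


class Obj:
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        calls.append("a.x")
        return self._x


def g(a):
    calls.append("g")
    a._x += 1          # side effect that later arguments can observe
    return a._x


def h(a):
    calls.append("h")
    a._x *= 10         # another order-dependent side effect
    return a._x


def f(p, q, r):
    calls.append("f")
    return (p, q, r)


# Direct call: Python evaluates the arguments left to right.
a = Obj()
direct = f(g(a), a.x, h(a))
direct_order = list(calls)

# Explicit temporaries: the order is spelled out statement by statement.
calls.clear()
a = Obj()
tmp1 = g(a)
tmp2 = a.x
tmp3 = h(a)
with_temps = f(tmp1, tmp2, tmp3)
temps_order = list(calls)
```

Writing the temporaries out by hand makes the intended order explicit in the source, so it no longer depends on how the call is compiled.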