[Cython] CF and finally clause

Wed May 25 07:21:26 CEST 2011

Robert Bradshaw, 25.05.2011 01:30:
> On Tue, May 24, 2011 at 2:17 PM, Carl Witty wrote:
>> On Tue, May 24, 2011 at 2:04 PM, Stefan Behnel wrote:
>>> I'm not so opposed to this proposal. I have been (idly and unfoundedly)
>>> wondering basically forever if the current way try-finally is implemented is
>>> actually a good one. I can accept a performance penalty for the exception
>>> case in both try-finally and try-except, but the normal case in try-finally
>>> should run as fast as possible. I'm not sure the C compiler can always
>>> detect what that "normal" case is in order to properly optimise for it.
>>
>> Evidently Java compilers duplicate the finally block (or, actually,
>> triplicate it):
>>
>> http://cliffhacks.blogspot.com/2008/02/java-6-tryfinally-compilation-without.html
>
> Interesting...
>
> I don't like the idea of copying code all over, Stefan makes some good points.

Note that we generate slightly different code for the good and bad cases 
anyway. Only the exception case stores away and restores the exception, the 
other ones don't need to do that.
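
As a rough Python-level illustration of that difference (not the C code 
Cython actually generates), a duplicated finally body could look like the 
sketch below. The function names and the "action"/"cleanup" callables are 
made up for the example; only the exception path has to hold on to the 
exception and re-raise it.

```python
def with_finally(action, cleanup):
    # The form as written by the user: one cleanup body, shared by all exits.
    try:
        return action()
    finally:
        cleanup()


def duplicated(action, cleanup):
    # Hand-expanded form: the cleanup body appears once per exit path.
    try:
        result = action()
    except BaseException:
        # Exception path: the exception is kept while cleanup runs,
        # then re-raised by the bare "raise".
        cleanup()
        raise
    # Normal path: a plain copy of the cleanup, no exception bookkeeping.
    cleanup()
    return result
```

Both functions run the cleanup exactly once, whether "action" returns or 
raises.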

I also dislike code duplication in general, but finally clauses tend to be 
really short. Most of the time, it's just a couple of lines of cleanup 
code, often just a single function/method call ("file.close()"). In a 
Cython context, it's even better because many of the finally clauses will 
just do C cleanup ("free()"), without major Python operations that would 
bloat the generated code with C-API calls or optimistic code paths. For 
example, I can't remember having ever seen for-loops or tuple unpacking in 
a finally clause, which are the things (apart from Python argument 
unpacking) that Cython generates the longest code for.

All that a finally clause really guarantees is that its body gets started. 
If the body raises an exception itself, you're on your own again. So I'd 
rather expect to find try-except(-pass) inside a finally clause than a 
nested try-finally, which makes recursive code explosion rather unlikely.
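
A minimal sketch of that point, in plain Python. The helpers 
"failing_close" and "release" are hypothetical stand-ins for real cleanup 
calls: once a statement in the finally body raises, the rest of the body 
is skipped, unless the risky call is wrapped in try-except(-pass).

```python
log = []


def failing_close():
    # Hypothetical cleanup step that fails.
    raise OSError("close failed")


def release():
    # Hypothetical second cleanup step; records that it ran.
    log.append("released")


def unguarded():
    try:
        pass
    finally:
        failing_close()  # raises, so release() below never runs
        release()


def guarded():
    try:
        pass
    finally:
        try:
            failing_close()
        except OSError:
            pass  # swallow the cleanup failure and carry on
        release()
```

Calling "unguarded()" propagates the OSError and leaves "log" empty, while 
"guarded()" completes and appends "released".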

I just looked through lxml's sources and found that out of 36 finally 
clauses, 26 (more than 2/3) are one-liners like "thing.close()", 
"free(mem)" or "self.attr = None". Only two clauses are longer than three 
lines because they do different things in Py2 and Py3 (so the C compiler 
will drop part of it), everything else is basically just a permutation of 
the above three statements or an if-test before cleanup.

A quick skim through the stdlib seems to confirm that: lots of one-liners, 
some if-tests, very few other cases. Longer blocks are truly rare, such as 
in the dummy threading module, which has a lengthy "if+cleanup" finally 
clause at the module scope, or the subprocess module, which has one 4-step 
"if+cleanup" section (obviously in the Windows code ;) ). IMHO, not really 
worth bothering about.

I think that programmers tend to be rather aware of what belongs in a 
finally clause and what doesn't really need to go there, either because 
they know that this will get executed in all possible cases, so it should 
only be the strict intersection of the different code paths - or simply 
because "finally" seems too mystical to entrust it with larger code blocks.

Stefan