[Cython] AddTraceback() slows down generators
vitja.makarov at gmail.com
Sat Jan 28 20:58:13 CET 2012
2012/1/28 mark florisson <markflorisson88 at gmail.com>:
> On 28 January 2012 19:41, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>> 2012/1/28 Stefan Behnel <stefan_ml at behnel.de>:
>>> Stefan Behnel, 27.01.2012 09:02:
>>>> any exception *propagation* is
>>>> still substantially slower than necessary, and that's a general issue.
>>> Here's a general take on a code object cache for exception propagation.
>>> When I raise an exception in test code that propagates through a Python
>>> call hierarchy of four functions before being caught, the cache gives me
>>> something like a 2x speedup in total. Not bad. When I do the same for cdef
>>> functions, it's more like 4-5x.
>>> The main idea is to cache the objects in a reallocable C array and bisect
>>> into it based on the C code "__LINE__" of the exception, which should be
>>> unique enough for a given module.
>>> It's a global cache that doesn't limit the lifetime of code objects (well,
>>> up to the lifetime of the module, obviously). I don't know if that's a
>>> problem because the number of code objects is only bounded by the number of
>>> exception origination points in the C source code, which is usually quite
>>> large. However, only a tiny fraction of those will ever raise or propagate
>>> an exception in practice, so the real number of cached code objects will be
>>> substantially smaller.
>>> Maybe thorough test suites with lots of failure testing would notice a
>>> difference in memory consumption, even though a single code object isn't
>>> all that large either...
>>> What do you think?
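For reference, the cache described above (a reallocable C array, kept sorted and bisected on the C "__LINE__" value) could look roughly like the sketch below. This is not the code from Stefan's branch. All names (cache_entry, code_cache, cache_bisect, cache_find, cache_insert) are made up for illustration, and a plain void* stands in for the PyCodeObject* so the sketch compiles without Python.h. Reference counting is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: maps a C source line to a cached object.
   A real module would store a PyCodeObject* and manage its refcount. */
typedef struct {
    int c_line;         /* __LINE__ of the exception origination point */
    void *code_object;  /* stand-in for the cached code object */
} cache_entry;

typedef struct {
    cache_entry *entries;  /* kept sorted by c_line */
    int count;
    int capacity;
} code_cache;

/* Index of the first entry whose c_line is >= the requested line. */
static int cache_bisect(const code_cache *cache, int c_line) {
    int lo = 0, hi = cache->count;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (cache->entries[mid].c_line < c_line)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Return the cached object for c_line, or NULL on a cache miss. */
static void *cache_find(const code_cache *cache, int c_line) {
    int pos = cache_bisect(cache, c_line);
    if (pos < cache->count && cache->entries[pos].c_line == c_line)
        return cache->entries[pos].code_object;
    return NULL;
}

/* Insert or overwrite an entry; 0 on success, -1 if realloc fails. */
static int cache_insert(code_cache *cache, int c_line, void *code_object) {
    int pos;
    assert(c_line > 0);  /* __LINE__ is always positive */
    pos = cache_bisect(cache, c_line);
    if (pos < cache->count && cache->entries[pos].c_line == c_line) {
        cache->entries[pos].code_object = code_object;
        return 0;
    }
    if (cache->count == cache->capacity) {
        /* Grow geometrically so inserts stay cheap on average. */
        int new_capacity = cache->capacity ? cache->capacity * 2 : 16;
        cache_entry *grown = realloc(cache->entries,
                                     (size_t)new_capacity * sizeof(cache_entry));
        if (!grown)
            return -1;
        cache->entries = grown;
        cache->capacity = new_capacity;
    }
    if (pos < cache->count) {
        /* Shift the tail up by one slot to keep the array sorted. */
        memmove(&cache->entries[pos + 1], &cache->entries[pos],
                (size_t)(cache->count - pos) * sizeof(cache_entry));
    }
    cache->entries[pos].c_line = c_line;
    cache->entries[pos].code_object = code_object;
    cache->count++;
    return 0;
}
```

A lookup costs O(log n) comparisons. An insert additionally pays for one memmove, which should be acceptable if only a small fraction of the origination points ever raises.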
>> We already have --no-c-in-traceback flag that disables C line numbers
>> in traceback.
>> What about enabling it by default?
> I'm quite attached to that feature actually :), it would be pretty
> annoying to disable that flag every time. And what would disabling
> that option gain, as the current code still formats the filename and
> function name?
It's rather useful for developers or for debugging. Most people
don't need it.
Here is a simple benchmark:
# upstream/master: 6.38ms
# upstream/master (no-c-in-traceback): 3.07ms
# scoder/master: 1.31ms
cdef int i
for i in range(10000):
Stefan's branch wins, but:
- there is only one item in the cache and it's always hit
- we can still avoid calling PyString_FromString() by making the function
name and source file name Python constants (I've tried it and I get