
Hi,

I had some fun over the last few days implementing low-level object freelist support in Cython and gave it a try in lxml. It's implemented as a class decorator, so all you have to do is add a "@cython.freelist(8)" declaration to get an 8-item freelist for the class. Even relatively short freelists tend to help a lot because they cover typical iteration use cases, for example.

For benchmarking, I initially used "Element.attrib", which is a property that instantiates a new object on each request (it is not cached, in order to avoid reference cycles). Most use cases of this class consist of a one-shot operation to read some XML attributes, so it's a classic "create and throw away" thing, the ideal candidate for a freelist.

The code that I ran through timeit is this:

Setup:
    'import lxml.etree as et; el = et.Element("tag", a="1", b="2")'

Benchmarked code:
    'el.attrib'

Initial timings:
    10000000 loops, best of 3: 0.0856 usec per loop

Replacing "_Attrib(element)" by "_Attrib.__new__(_Attrib, element)":
    10000000 loops, best of 3: 0.0781 usec per loop

Adding only the freelist decorator to the _Attrib class:
    10000000 loops, best of 3: 0.0737 usec per loop

Combining both "__new__()" and the freelist:
    10000000 loops, best of 3: 0.0608 usec per loop

That cuts the time per access by almost 30%, just by changing two lines of code.

The main Element proxy class in lxml is another case where it really makes sense to apply this decorator. It gets instantiated whenever Python access to an XML element is requested, e.g. during iteration. Here are the timings for iterating over a large XML document.

Benchmarked code:
    'list(islice(tree.iter(), 10000000, 10000001))'

Initial timings:
    100 loops, best of 3: 15.6 msec per loop

Enabling the freelist:
    100 loops, best of 3: 13.2 msec per loop

That's about 15% less time, even all the way through a non-trivial tree iteration operation, and with just one additional declaration in the code.
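As a quick check of the percentages quoted above, the relative reduction in runtime can be recomputed from the reported timings. This is only a sketch: the helper name "reduction" is made up for the illustration, and only the numbers come from the timeit output. It should come out at roughly 29% for "el.attrib" and roughly 15% for the iteration case.

```python
def reduction(before, after):
    """Return the relative reduction in runtime, in percent."""
    return (before - after) / before * 100.0


# Element.attrib: initial timing vs. __new__() plus freelist (usec per loop)
attrib = reduction(0.0856, 0.0608)

# tree.iter(): initial timing vs. freelist enabled (msec per loop)
iteration = reduction(15.6, 13.2)

print(f"attrib:    {attrib:.1f}% less time per access")
print(f"iteration: {iteration:.1f}% less time per loop")
```

Note that these figures are reductions in time per operation, not throughput ratios.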
Note that this change will require Cython 0.19, which isn't currently close to release (we just recently released 0.18). So this change may take a bit of time to make it into an lxml release.

Stefan

Dirk Rothe, 24.02.2013 15:43:
Yes, it's a purely internal thing. I'll apply the patch as soon as I can reasonably add a dependency on Cython 0.19. New releases of lxml have traditionally used the most recent Cython version anyway, so that shouldn't take long.

As it stands, the freelist support doesn't currently apply to subtypes, so lxml.objectify or other types of custom element classes can't benefit. But I may find a way around that before it gets released.

Stefan
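For readers unfamiliar with the idea, here is a plain-Python model of what a bounded freelist does and why an exact-type check excludes subtypes. It is a conceptual sketch only: Cython's freelist lives in the generated C allocation and deallocation code and recycles raw object memory, which Python code cannot do. The names "FreeList", "Attrib" and "MyAttrib" are hypothetical and not part of Cython or lxml.

```python
class FreeList:
    """Conceptual model of a bounded freelist for one exact type."""

    def __init__(self, cls, maxsize=8):
        self.cls = cls
        self.maxsize = maxsize
        self._free = []  # recycled instances awaiting reuse

    def acquire(self, *args):
        # Reuse a parked instance if there is one, otherwise allocate.
        if self._free:
            obj = self._free.pop()
        else:
            obj = self.cls.__new__(self.cls)
        obj.__init__(*args)
        return obj

    def release(self, obj):
        # Only the exact type is recycled. Subtypes, and objects that
        # arrive when the list is already full, are not kept.
        if type(obj) is self.cls and len(self._free) < self.maxsize:
            self._free.append(obj)
            return True
        return False


class Attrib:
    def __init__(self, element):
        self.element = element


class MyAttrib(Attrib):
    """A subtype, standing in for a custom element class."""


pool = FreeList(Attrib, maxsize=8)
first = pool.acquire("el1")
pool.release(first)            # parked for reuse
second = pool.acquire("el2")   # gets the parked object back

print(second is first)                  # same object, re-initialised
print(pool.release(MyAttrib("el3")))    # subtype is rejected
```

In a "create and throw away" pattern such as repeated attribute access, almost every request is served from the list, which is where the saving comes from.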

On 24.02.2013 at 17:09, Stefan Behnel <stefan_ml@behnel.de> wrote:
Does your last note apply to stuff like ElementDefaultClassLookup(element=MyElement)? We are using that to add some convenience functions (clone(), delete(), pprint(), subelement(), and xpath() with predefined namespaces) to etree.Elements.

--dirk

Dirk Rothe, 24.02.2013 20:13:
It should, eventually. I already extended it to all subtypes that have the same PyObject struct size, but including subtypes with different sizes will need a bit more work.

BTW, you could sponsor this work, if you want to be sure it gets done.

Stefan

Stefan Behnel, 24.02.2013 22:54:
After looking into this some more, I don't think it makes much sense to support freelists for regular Python objects. It's hard enough to get this working safely for arbitrary extension types, and I may not even end up doing that. I'll have enough to do just making sure the current freelist support is safe.

Stefan