[Twisted-Python] cBanana diffs

I've made a couple of local changes to cBanana to look at improving the performance of it some. The diff is available from:

  http://day.cubik.org/~bruce/spread.diff

The change from the malloc/memcpy/free sequence to a realloc should be pretty clear. The remainder of the changes are intended to let us allocate the Python list objects at the correct size (since we know that) and to then use PyList_SET_ITEM() rather than PyList_Append(). In theory that should help; in practice it didn't help much on simple tests.

One thing that was taking a while in simple runs was the handling of LONGINT and LONGNEG values (which call back into Python rather than using C code). I've not yet optimized that at all, as I don't yet have a good enough understanding of it.

I'll also run this new cBanana.c under valgrind or Purify this week to make sure that I didn't introduce any memory leaks.

Comments and suggestions are welcome.

 - Bruce
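The diff itself is not reproduced in this thread, so the following is only a rough sketch of the kind of change described above: growing a buffer with a malloc/memcpy/free sequence versus a single realloc. The struct and function names (buf, buf_append_copy, buf_append_realloc) and the doubling growth policy are hypothetical and are not taken from cBanana.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; NOT the actual cBanana structure. */
struct buf {
    char  *data;  /* heap block, or NULL when empty */
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
};

/* Old pattern: allocate a new block, copy the old contents, free the old block. */
int buf_append_copy(struct buf *b, const char *src, size_t n)
{
    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t new_cap = (b->len + n) * 2;  /* assumed growth policy */
        char *p = malloc(new_cap);
        if (p == NULL)
            return -1;
        if (b->len > 0)
            memcpy(p, b->data, b->len);
        free(b->data);
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* New pattern: one realloc call, which may be able to grow the block in place. */
int buf_append_realloc(struct buf *b, const char *src, size_t n)
{
    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t new_cap = (b->len + n) * 2;  /* assumed growth policy */
        /* realloc(NULL, size) behaves like malloc(size). */
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;  /* on failure the old block is untouched and still owned by b */
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

Both functions leave the buffer with the same contents. The realloc version is shorter and gives the allocator a chance to extend the existing block instead of always copying. Assigning the result to a temporary before overwriting b->data avoids leaking the old block if realloc fails.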

Bruce Mitchener wrote:
> I've made a couple of local changes to cBanana to look at improving the performance of it some.

Great! Got any numbers showing the speed difference?

Well, in general it's probably a good idea to avoid using long ints, since they will be slow no matter what and support in other languages may be iffy.

Itamar Shtull-Trauring wrote:
Nothing reliable. I was just testing with doc/examples/pbbenchserver.py and pbbenchclient.py and looking at how many calls/second were being made.

This week, I'm planning on putting together a quick Banana bench and then I'll be able to test it directly and with more predictable loads to exercise the parts that I'm changing. :) That'll let me get reliable and useful numbers and do some targeted profiling as well, I hope. Those fixes were just things that I'd noticed when reading through the source.

Yep. I was originally unaware of what they were in Python. :) Still learning...

 - Bruce

Bruce Mitchener wrote:
So, I wrote this crappy program that works to test decoding from banana:

  http://day.cubik.org/~bruce/bananabench.py

10000 iterations of decoding the banana-encoded form of:

  [1, 2, [3, 4], [30.5, 40.2], 5, ["six", "seven", ["eight", 9]], [10], []]

had these results:

  Pure Python: 22.56 seconds
  CVS cBanana:  1.15 seconds
  My cBanana:   0.98 seconds

Now, that data is list heavy, so it is particularly suited to enjoy the benefits of my patch. But, given the sorts of data that I know we pass around at work (not in Twisted), our data is typically pretty list heavy. Glyph said that PB stuff is usually pretty list-heavy as well. So, that looks to be a gain.

I ran it under valgrind and that didn't seem to turn up any leaks.

Cheers,

 - Bruce

Bruce Mitchener wrote:
These changes plus new ones are now in CVS. Current numbers from my bananabench look like:

  Pure Python:
    Encode took 11.9482729435 seconds
    Decode took 22.5815860033 seconds

  Old cBanana:
    Encode took 0.707735061646 seconds
    Decode took 1.09367489815 seconds

  Current cBanana:
    Encode took 0.633662939072 seconds
    Decode took 0.930390954018 seconds

I'm not sure why the old cBanana dropped from 1.15 seconds to 1.09 seconds, but I do know that the additional changes to cBanana.c were responsible for the drop from 0.98 to 0.93 seconds.

I'll finish up work on bananabench.py and check it in within a couple of days.

Cheers,

 - Bruce
