Hello Reza,

I tried the solution you provided and I have to say, it changed a lot!
You gave me a better understanding of how things work with Twisted, and I really appreciate your response!

Thanks for your help!
Best regards,
Dirk Moors 

2009/10/13 <twisted-python-request@twistedmatrix.com>
Send Twisted-Python mailing list submissions to
       twisted-python@twistedmatrix.com

To subscribe or unsubscribe via the World Wide Web, visit
       http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
or, via email, send a message with subject or body 'help' to
       twisted-python-request@twistedmatrix.com

You can reach the person managing the list at
       twisted-python-owner@twistedmatrix.com

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Twisted-Python digest..."


Today's Topics:

  1. Re: Twisted Python vs. "Blocking" Python: Weird performance
     on small operations. (Reza Lotun)
  2. Re: Twisted-Python Digest, Vol 67, Issue 22 (Dirk Moors)


----------------------------------------------------------------------

Message: 1
Date: Tue, 13 Oct 2009 15:04:06 +0100
From: Reza Lotun <rlotun@gmail.com>
Subject: Re: [Twisted-Python] Twisted Python vs. "Blocking" Python:
       Weird performance on small operations.
To: Twisted general discussion <twisted-python@twistedmatrix.com>
Message-ID:
       <95bb10690910130704o7c0ff2besf00dcf5918990dcf@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi Dirk,

I took a look at your code sample and got the async benchmark to run
with the following values:
*** Starting Asynchronous Benchmarks.

 -> Asynchronous Benchmark (1 runs) Completed in 0.000181913375854 seconds.
 -> Asynchronous Benchmark (10 runs) Completed in 0.000736951828003 seconds.
 -> Asynchronous Benchmark (100 runs) Completed in 0.00641012191772 seconds.
 -> Asynchronous Benchmark (1000 runs) Completed in 0.0741751194 seconds.
 -> Asynchronous Benchmark (10000 runs) Completed in 0.675071001053 seconds.
 -> Asynchronous Benchmark (100000 runs) Completed in 7.29738497734 seconds.

*** Asynchronous Benchmarks Completed in 8.16032314301 seconds.

That, though still quite a bit slower than the synchronous version,
is much better than the 40-second mark you were experiencing.
My modified version simply returns defer.succeed from your async
block-compute functions.

i.e. Instead of your initial example:

from twisted.internet import defer, reactor
import struct

def int2binAsync(anInteger):
    def packStruct(i):
        # Packs an integer; the result is 4 bytes
        return struct.pack("i", i)

    d = defer.Deferred()
    d.addCallback(packStruct)

    # Fire the callback chain on the next reactor iteration
    reactor.callLater(0, d.callback, anInteger)
    return d

my version does:

import struct
from twisted.internet import defer

def int2binAsync(anInteger):
    return defer.succeed(struct.pack('i', anInteger))

A few things to note in general, however:
1) Twisted shines for I/O-bound operations - i.e. networking. A
compute-intensive process will not necessarily gain any performance
from Twisted, since Python's GIL (a global interpreter lock) is still
in effect.

2) If you are doing computations that use a C module (unfortunately
struct before Python 2.6, I believe, isn't backed by one), there is a
chance that the C module releases the GIL, allowing you to do those
computations in a thread. In that case you'd be better off using
deferToThread, as suggested earlier.

3) There is some overhead (usually minimal, but it exists) to using
Twisted. Instead of computing everything serially and returning
your answer as in your sync example, you're wrapping everything up in
Deferreds and starting a reactor - it's definitely going to be a bit
slower than the pure synchronous version in this case.

Hope that makes sense.

Cheers,
Reza


--
Reza Lotun
mobile: +44 (0)7521 310 763
email:  rlotun@gmail.com
work:   reza@tweetdeck.com
twitter: @rlotun



------------------------------

Message: 2
Date: Tue, 13 Oct 2009 16:18:35 +0200
From: Dirk Moors <dirkmoors@gmail.com>
Subject: Re: [Twisted-Python] Twisted-Python Digest, Vol 67, Issue 22
To: twisted-python@twistedmatrix.com
Message-ID:
       <cf75a1410910130718m53645515oc65f0890366a12f2@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hello Valeriy,

I tried the thing you suggested, and I attached the (updated) code.
Unfortunately, the new code was even slower, producing the following results:

*** Starting Asynchronous Benchmarks. (Using Twisted, with "deferred-decorator")
 -> Asynchronous Benchmark (1 runs) Completed in 56.0279998779 seconds.
 -> Asynchronous Benchmark (10 runs) Completed in 56.0130000114 seconds.
 -> Asynchronous Benchmark (100 runs) Completed in 56.010999918 seconds.
 -> Asynchronous Benchmark (1000 runs) Completed in 56.0410001278 seconds.
 -> Asynchronous Benchmark (10000 runs) Completed in 56.3069999218 seconds.
 -> Asynchronous Benchmark (100000 runs) Completed in 58.8910000324 seconds.
*** Asynchronous Benchmarks Completed in 59.4659998417 seconds.

I suspect this would be more inefficient because, with the deferToThread
function in place, every single operation is executed in its own
thread, which means:
(1 x 2) + (10 x 2) + (100 x 2) + (1000 x 2) + (10000 x 2) + (100000 x 2)
threads... which is... a lot.
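
For what it's worth, the sum above works out as:

```python
# Total operations implied by the benchmark sizes above,
# at two conversions (pack + unpack) per run.
runs = [1, 10, 100, 1000, 10000, 100000]
total = sum(r * 2 for r in runs)
print(total)  # 222222
```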

Maybe the problem lies in the way I test the code? I understand that using
the asynchronous test code this way (generating the deferreds in a
for-loop), a lot of deferreds are generated before the reactor starts
calling the deferred callbacks... would there be another, better way to
test the code?
The reason I need to know which one is faster (async vs. sync functions) is
that I need to decide whether or not I should re-evaluate the code I
just recently finished building.

Any other ideas maybe?

Thanks in advance,
Dirk


________________________________________________________________________________________________________________________________________________________
> Message: 3
> Date: Tue, 13 Oct 2009 09:41:19 -0400
> From: Valeriy Pogrebitskiy <vpogrebi@verizon.net>
> Subject: Re: [Twisted-Python] Twisted Python vs. "Blocking" Python:
>        Weird performance on small operations.
> To: Twisted general discussion <twisted-python@twistedmatrix.com>
> Message-ID: <EDB2B354-B25D-4A98-AC9D-B9745CA6C3AB@verizon.net>
> Content-Type: text/plain; charset="us-ascii"
>
> Dirk,
>
> Using a deferred directly in your bin2intAsync() may be somewhat less
> efficient than the approach described in Recipe 439358, "[Twisted]
> From blocking functions to deferred functions"
> (http://code.activestate.com/recipes/439358/).
>
> You would get the same effect (asynchronous execution) - but potentially
> more efficiently - by just decorating your synchronous methods as:
>
> from twisted.internet.threads import deferToThread
> deferred = deferToThread.__get__
> ....
> @deferred
> def int2binAsync(anInteger):
>     #Packs an integer, result is 4 bytes
>     return struct.pack("i", anInteger)
>
> @deferred
> def bin2intAsync(aBin):
>     #Unpacks a bytestring into an integer
>     return struct.unpack("i", aBin)[0]
>
>
>
>
> Kind regards,
>
> Valeriy Pogrebitskiy
> vpogrebi@verizon.net
>
>
>
>
> On Oct 13, 2009, at 9:18 AM, Dirk Moors wrote:
>
> > Hello Everyone!
> >
> > My name is Dirk Moors, and for 4 years now I've been involved in
> > developing a cloud computing platform, using Python as the
> > programming language. A year ago I discovered Twisted Python, and it
> > got me very interested, up to the point where I made the decision to
> > convert our platform (in progress) to a Twisted platform. One year
> > later I'm still very enthusiastic about the overall performance and
> > stability, but last week I encountered something I didn't expect;
> >
> > It appeared to be less efficient to run small "atomic"
> > operations in different deferred callbacks than to run
> > these "atomic" operations together in "blocking" mode. Am I doing
> > something wrong here?
> >
> > To prove the problem to myself, I created the following example
> > (Full source- and test code is attached):
> >
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > import struct
> > from twisted.internet import defer, reactor
> >
> > def int2binAsync(anInteger):
> >     def packStruct(i):
> >         #Packs an integer, result is 4 bytes
> >         return struct.pack("i", i)
> >
> >     d = defer.Deferred()
> >     d.addCallback(packStruct)
> >
> >     reactor.callLater(0,
> >                       d.callback,
> >                       anInteger)
> >
> >     return d
> >
> > def bin2intAsync(aBin):
> >     def unpackStruct(p):
> >         #Unpacks a bytestring into an integer
> >         return struct.unpack("i", p)[0]
> >
> >     d = defer.Deferred()
> >     d.addCallback(unpackStruct)
> >
> >     reactor.callLater(0,
> >                       d.callback,
> >                       aBin)
> >     return d
> >
> > def int2binSync(anInteger):
> >     #Packs an integer, result is 4 bytes
> >     return struct.pack("i", anInteger)
> >
> > def bin2intSync(aBin):
> >     #Unpacks a bytestring into an integer
> >     return struct.unpack("i", aBin)[0]
> >
> >
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > While running the testcode I got the following results:
> >
> > (1 run = converting an integer to a byte string, converting that
> > byte string back to an integer, and finally checking whether that
> > last integer is the same as the input integer.)
> >
> > *** Starting Synchronous Benchmarks. (No Twisted => "blocking" code)
> >   -> Synchronous Benchmark (1 runs) Completed in 0.0 seconds.
> >   -> Synchronous Benchmark (10 runs) Completed in 0.0 seconds.
> >   -> Synchronous Benchmark (100 runs) Completed in 0.0 seconds.
> >   -> Synchronous Benchmark (1000 runs) Completed in 0.00399994850159 seconds.
> >   -> Synchronous Benchmark (10000 runs) Completed in 0.0369999408722 seconds.
> >   -> Synchronous Benchmark (100000 runs) Completed in 0.362999916077 seconds.
> > *** Synchronous Benchmarks Completed in 0.406000137329 seconds.
> >
> > *** Starting Asynchronous Benchmarks. (Twisted => "non-blocking" code)
> >   -> Asynchronous Benchmark (1 runs) Completed in 34.5090000629 seconds.
> >   -> Asynchronous Benchmark (10 runs) Completed in 34.5099999905 seconds.
> >   -> Asynchronous Benchmark (100 runs) Completed in 34.5130000114 seconds.
> >   -> Asynchronous Benchmark (1000 runs) Completed in 34.5859999657 seconds.
> >   -> Asynchronous Benchmark (10000 runs) Completed in 35.2829999924 seconds.
> >   -> Asynchronous Benchmark (100000 runs) Completed in 41.492000103 seconds.
> > *** Asynchronous Benchmarks Completed in 42.1460001469 seconds.
> >
> > Am I really seeing a factor of 100x??
> >
> > I really hope I made a huge reasoning error here, but I just
> > can't find it. If my results are correct then I really need to go
> > and check my entire cloud platform for the places where I decided to
> > split functions into atomic operations, thinking that it would
> > improve performance when in fact it did the opposite.
> >
> > I personally suspect that I'm losing my CPU cycles to the reactor
> > scheduling the deferred callbacks. Would that assumption make any
> > sense?
> > The part where I need these conversion functions is in marshalling/
> > protocol reading and writing throughout the cloud platform, which
> > implies that these functions will be called constantly, so I need
> > them to be super fast. I always thought I had to split the entire
> > marshalling process into small atomic (deferred-callback) functions
> > to be efficient, but these figures tell me otherwise.
> >
> > I really hope someone can help me out here.
> >
> > Thanks in advance,
> > Best regards,
> > Dirk Moors
> >
> > <twistedbenchmark.py>_______________________________________________
> > Twisted-Python mailing list
> > Twisted-Python@twistedmatrix.com
> > http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://twistedmatrix.com/pipermail/twisted-python/attachments/20091013/e9ae2546/attachment.htm
>
> ------------------------------
>
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python@twistedmatrix.com
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>
>
> End of Twisted-Python Digest, Vol 67, Issue 22
> **********************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://twistedmatrix.com/pipermail/twisted-python/attachments/20091013/357ffe0c/attachment.htm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twistedbenchmark.py
Type: application/octet-stream
Size: 7269 bytes
Desc: not available
Url : http://twistedmatrix.com/pipermail/twisted-python/attachments/20091013/357ffe0c/attachment.obj

------------------------------

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python


End of Twisted-Python Digest, Vol 67, Issue 23
**********************************************