[sapug] Stateless Iteration

Chris Foote chris at inetd.com.au
Sun Jun 11 14:47:40 CEST 2006


On Sun, 11 Jun 2006, Daryl Tester wrote:

> I'm puttering around with XML-RPC, which is all
> relatively straightforward and such, and have come
> across an issue that I've struck before
> and never adequately resolved, and that is the
> burning issue of today - "Scalable APIs".

I do like the simplicity of XMLRPC, but recently I've become a real
fan of Pyro:
 	http://pyro.sourceforge.net
It has lots of advantages over XMLRPC, particularly speed and TCP
connection reuse, but has just one disadvantage - Python specificity.

> I want to provide an access method for a large list
> of objects (say, Customers, or Accounts, or records
> of some type - they will be uniform types).  If I
> provide a method that says "return list", then it
> may crunch away on a very large dataset and occupy:
>
> 1) a large amount of bandwidth traversing the network
> (although if the data is required, the data is
> required), so this reason is a bit of a furphy.
>
> B) a large amount of time to transfer the result
> (variation on 1, but more important).
>
> III) a large amount of memory to store the result
> on the client.
>
> (also an inability to stick to a consistent numbering
> format, but that's the fault of the author).
>
> B & III are the ones I'm most concerned about.

I've found that Judy arrays are good for memory usage.  There's a Python
interface to Judy called PyJudy:
 	http://www.dalkescientific.com/Python/PyJudy.html

If you can't fit it into RAM, then maybe a fast disk-based storage system
such as Metakit might be usable:
 	http://www.equi4.com/metakit/python.html

> If the results were driving (for example) a GUI listbox,
> I'd like the results to start populating that immediately
> rather than waiting for the entire result set to be
> returned.  If I'm driving an export or conversion process
> then I don't need to store the entire result set on the
> client, I can process each record as it comes in.

A good way might be to have the drop down populate itself from a blocking
queue, with another thread fetching the results from the remote procedure
call and adding each one to the queue as it arrives.  See:
 	http://www.python.org/doc/current/lib/module-Queue.html
 	http://www.python.org/doc/current/lib/QueueObjects.html

I'm using threads & Queues to implement an RPC logging server which
allows the client to continue getting its work done whilst logging
happens asynchronously.  Your problem might be similar.
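
A minimal sketch of the thread-plus-queue arrangement described above,
written for a current Python where the module is called queue (it was
Queue in the releases of the time).  The names fetch_records, producer,
consume and run_demo are hypothetical; fetch_records merely stands in
for whatever remote call actually delivers the records.

```python
import queue      # this module was named "Queue" in Python 2
import threading

_DONE = object()  # sentinel marking the end of the result set


def fetch_records():
    """Hypothetical stand-in for the remote call; yields one record at a time."""
    for n in range(5):
        yield {"id": n, "name": "customer-%d" % n}


def producer(q):
    """Run in a worker thread: push each record onto the queue as it arrives."""
    try:
        for record in fetch_records():
            q.put(record)
    finally:
        q.put(_DONE)  # always signal completion, even if the fetch fails


def consume(q, handle):
    """Block on the queue and pass each record to `handle` until the sentinel."""
    while True:
        item = q.get(block=True)
        if item is _DONE:
            break
        handle(item)


def run_demo():
    q = queue.Queue(maxsize=100)  # bounded, so the producer cannot run far ahead
    rows = []
    worker = threading.Thread(target=producer, args=(q,))
    worker.start()
    consume(q, rows.append)       # a GUI would add a row to its widget here
    worker.join()
    return rows
```

The bounded queue is a design choice rather than a requirement: it keeps
client memory use capped when the consumer is slower than the network.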

> Cool, so some form of iterator would probably be ideal.

You could write a generator whose yield statement hands back the value
returned by a Queue's get(block=True) method.

Hope that helps,

-- 
Chris Foote <chris at inetd.com.au>
Inetd Pty Ltd T/A HostExpress
Web:   http://www.hostexpress.com.au
Blog:  http://www.hostexpress.com.au/drupal/chris
Phone: (08) 8410 4566

