[python-ldap] ldap.OPT_DESC, async ops and paged search controls

Mark R Bannister mark at proseconsulting.co.uk
Tue Jan 27 22:13:24 CET 2015


On 27/01/2015 20:52, Michael Ströder wrote:
> Mark R Bannister wrote:
>> The original design limits the number of threads in an attempt to be more
>> scalable.  There is a fixed number of workers that can each be responsible for
>> a larger number of LDAP connections.  This is by design.  If I launched a new
>> blocking thread for each LDAP connection, it would be easy to overload the
>> system with too many threads by sending in many different requests
>> simultaneously.
> I don't know what you're after.
>
> I thought you have many incoming requests and you want to handle them
> completely asynchronously placing incoming requests into a queue. You could
> send many LDAP requests to the server without blocking and dispatch the
> incoming responses to the original requests in the queue.
>
> Ciao, Michael.
>
Indeed, which is more or less how it's done.  But you spoke of having 
a blocking thread reading results back from the directory server, and I 
don't see how that could be an improvement.

Here is a rough diagram of how it currently works.  One half of the jigsaw 
puzzle is implemented by my Pyloom library 
(https://sourceforge.net/p/dbis/code/ci/default/tree/src/pyloom/__init__.py.in). 
One listen socket, some marshal threads that pick up incoming requests 
and dispatch to a separate group of worker threads:

             [listen socket]
        ____________|____________
        |           |           |
    [marshal]   [marshal]   [marshal]
    ____|____   ____|____   ____|____
    |   |   |   |   |   |   |   |   |
   [w] [w] [w] [w] [w] [w] [w] [w] [w]
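The fixed-pool layout above can be sketched in plain Python.  This is an 
illustration of the shape of the design, not the actual Pyloom code: each 
worker owns its own queue, a marshal dispatches round-robin, and None is 
used here as a stand-in shutdown message.

```python
import queue
import threading

NUM_WORKERS = 3
results = []
results_lock = threading.Lock()
worker_queues = [queue.Queue() for _ in range(NUM_WORKERS)]

def worker(q):
    # Each worker drains its own queue; None is the shutdown message.
    while True:
        item = q.get()
        if item is None:
            break
        with results_lock:
            results.append(item * 2)  # stand-in for real session work

workers = [threading.Thread(target=worker, args=(q,)) for q in worker_queues]
for t in workers:
    t.start()

_next = [0]
def marshal_dispatch(request):
    # Round-robin: queue lengths vary, but the thread count stays fixed.
    worker_queues[_next[0] % NUM_WORKERS].put(request)
    _next[0] += 1

for req in range(9):
    marshal_dispatch(req)
for q in worker_queues:
    q.put(None)
for t in workers:
    t.join()
```

The point of the round-robin dispatch is that load is spread across a 
constant number of threads, rather than a thread being spawned per request.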

The worker thread itself is implemented in the DBIS Server class 
(https://sourceforge.net/p/dbis/code/ci/default/tree/src/dbis/server.py) 
and has a queue of work, each item of which may have its own LDAP object 
(or a reference to another session that is currently looking up the 
information it needs):

   [w] +--> [session 1] ---> [LDAP object]
       |
       +--> [session 2] ---> [LDAP object]
       |
       +--> [session 3] ---> [waiting on another session]
       |
       ... etc.
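A session table of that shape might look like the following.  This is a 
hypothetical sketch, not the DBIS API: each entry holds either its own 
connection (where a real worker would keep an ldap.ldapobject instance, a 
string stands in here) or a reference to another session it is waiting on.

```python
class Session:
    def __init__(self, sid, conn=None, waiting_on=None):
        self.sid = sid
        self.conn = conn            # would be an LDAP object in practice
        self.waiting_on = waiting_on  # sid of the session being watched

sessions = {
    1: Session(1, conn="ldap-conn-1"),
    2: Session(2, conn="ldap-conn-2"),
    3: Session(3, waiting_on=1),    # waiting on session 1's lookup
}

def resolve_conn(sid):
    """Follow waiting_on references until a session with a connection."""
    s = sessions[sid]
    seen = set()
    while s.conn is None and s.waiting_on is not None:
        if s.sid in seen:
            raise RuntimeError("circular wait between sessions")
        seen.add(s.sid)
        s = sessions[s.waiting_on]
    return s.conn
```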

As you can see, I may have many LDAP objects in different states, and 
it's all asynchronous, but importantly I have a fixed number of 
threads.  The only thing that is dynamic is the size of a worker's 
queue, i.e. the number of sessions a worker thread is actively managing 
may go up or down, to an upper limit, but the total number of running 
threads stays the same.  Each worker relies on a select() call to wake 
it up, and there may be a number of things that wake a worker up (new 
incoming request from marshal thread, new incoming data from running 
LDAP search operation, data now available in watched session, server 
shutdown message).
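The select()-based wake-up can be sketched as follows.  In the real worker 
the second descriptor would be an LDAP connection's file descriptor (e.g. 
via its fileno()); here a socketpair stands in for both the marshal control 
pipe and the LDAP socket, and the message names are purely illustrative.

```python
import select
import socket

# Two wake-up sources: a marshal control pipe and a stand-in LDAP fd.
control_r, control_w = socket.socketpair()
ldap_r, ldap_w = socket.socketpair()

def worker_loop(ldap_fd, control):
    events = []
    while True:
        # One select() multiplexes all wake-up sources; ldap_fd is listed
        # first so pending search results are drained before a shutdown
        # message ends the loop.
        readable, _, _ = select.select([ldap_fd, control], [], [])
        for sock in readable:
            data = sock.recv(4096)
            if sock is ldap_fd:
                events.append(("ldap", data))
            elif data == b"SHUTDOWN":
                events.append(("control", data))
                return events

ldap_w.sendall(b"search-result")   # pretend search data arrived
control_w.sendall(b"SHUTDOWN")     # then the server shutdown message
events = worker_loop(ldap_r, control_r)
```

One select() call per worker is what keeps the thread count fixed: a single 
thread sleeps until any of its sessions, or the marshal, has something for it.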

Now what I don't follow is your suggestion of using a blocking thread to 
read results back from an LDAP search operation.  For that to work, I 
would need a much greater number of threads running, one per LDAP 
object.  I don't see how that would improve anything; I can only see it 
making the server perform more poorly when under heavy load.  But maybe 
I'm missing something here ...

Best regards,
Mark.
