On 12/28/19 6:24 PM, Tim Peters wrote:
[Larry]
Here is the original description of my problem, from the original email in
this thread.  I considered this an adequate explanation of my problem
at the time.

      
I do have a use case for this. In one project I maintain a "ready" list of
jobs; I need to iterate over it, but I also want fast lookup because I
soemtimes remove jobs when they're subsequently marked "not ready".
Which is a description not of "the problem", but of the operations you
want to perform.  It's the first time in my life I've heard of a
"ready list of jobs" where the programmer sometimes decides later "oh,
no - this one and that one aren't ready after all" ;-)


It's a lightweight abstract dependency graph.  Its nodes are opaque, only required to be hashable.  And it doesn't require that you give it all the nodes in strict dependency order.

When you add a node, you can also optionally specify dependencies, and those dependencies aren't required to be nodes that the graph has previously seen.  So you can add a node A that depends on B and C, without showing it B or C first.  When that happens it creates placeholder nodes for B and C, and naturally these nodes have no dependencies.  You can then later inform the graph that node B has dependencies on X Y and Z.

(My specific use case is a package manager.  Packages are identified with unique strings.  So the nodes I give the give the graph are simply those package names.  It's pretty common for me to get a package with dependencies on packages I haven't seen yet.  Designing the graph to create placeholders let me skip making two passes over the package list, pre-creating the package objects, etc etc.  This made the code simpler and presumably faster.)

The graph API maintains an externally-visible "ready" iterable of nodes.  And since B can go from ready to not-ready, it can get added to "ready" and subsequently removed.

Also, when a node is satisfied, you simply remove it from the graph.  If A depends on B and C, and they all represent jobs, and B and C have no dependencies, they'll be in the "ready" iterable.  You iterate over "ready", and execute those jobs, and assuming they are successful you then remove them from the graph.  When A's dependencies are all satisfied, it'll get added to the "ready" iterable.  And naturally when B and C were removed from the graph, they were removed from the "ready" iterable too.

Thus it's this "ready" collection that I want to a) be iterable, b) be stable, and c) support fast removal.  I add and remove nodes from that set a lot, which I realized would perturb the iteration order from run to run, etc etc etc, and here we are.


I grant you maybe it's a weird approach.  But after two false starts, where I was starting to boil the oceans with ABCs and "node is satisfied" APis and separate iterator objects, I had a re-think and hit on this super lightweight design.  I'm actually quite pleased with it--it's short, it has a minimal API, it's readable, it was easy to get right, and it solves my problem.


Happy new year,


/arry