Graph algorithms - DFS, generators callbacks, and optimisation

Sat Nov 29 05:54:49 EST 2003

I'm trying to check a graph (represented in the usual Python adjacency
list format) for loops. This is dead simple - you use a depth-first
search, and look out for "back edges" (edges from a vertex "back" to
one you've already seen).

I already have a DFS algorithm which generates each edge in turn. It's
great for what it does, but it ignores back edges.

def DFS(graph, start, seen = None):
    if not seen:
	seen = {start: 1}
    for v in graph[start]:
	if v not in seen:
	    seen[v] = 1
	    yield start, v
	    if v in graph:
		for e in DFS(graph, v, seen):
		    yield e

I could code a variation on this, just for this one problem, but that
sort of cut and paste coding offends me (and DFS is *just* fiddly
enough that I'm not confident of getting it right each time I
reimplement it). So what I'd rather do is enhance my existing code to
be a bit more general.

I have a C++ reference which implements a highly general DFS
algorithm, using the visitor pattern (pass a visitor object to the
function, and various callback methods are called at points in the
algorithm - there are discover_vertex, discover_edge, back_edge, etc
callbacks).

I could code something like this (the pseudocode for the algorithm is
almost executable in Python!) but it doesn't feel particularly
"pythonic". Having to write a visitor class for each use case feels
unwieldy (as well as being perceptibly slower than the generator
implementation). Having the various callbacks as arguments (pass a
callable or None) isn't much faster or cleaner.

Can anybody think of a good way of making the DFS code above a little 
more generic, without losing the efficient and simple (from the
caller's POV) approach of the current version?

Paul.
-- 
This signature intentionally left blank