Re: [Python-de] dijkstra algorithmus

19 Feb 2012

      sackbard wrote:
...
Hallo Python Community,
bin neu in der Liste und hab natürlich gleich eine Frage. Ich versuche
den dijkstra algorithmus in Python zu implementieren. Gegeben sind zwei
Dateien: die network.txt die die distanzen der einzelnen nodes angibt
und die route.txt die mir anfangs und endpunkt ausgibt.
Beispiel network.txt (in diesem Beispiel nodes A-G):
0,2,4,1,6,0,0
2,0,0,0,5,0,0
4,0,0,0,0,5,0
1,0,0,0,1,1,0
6,5,0,1,0,5,5
0,0,5,1,5,0,0
0,0,0,0,5,0,0
Beispiel route.txt:
B>F
Am ende soll eine neue txt datei erstellt werden welche die route und
die total path length enthält.
Bis auf den letzten Schritt funktioniert auch alles soweit (python 3.1.1).
Ich bin nur noch nicht dahinter gekommen wie ich am Ende am besten die
Route und die Total path length berechne. Hatte überlegt den weg vom
letzten zum ersten node einfach zurückzugehen aber das scheint mir
umständlich und kompliziert.
So kompliziert auch wieder nicht:

currentNode = endNode
route = []
while True:
    route.append(currentNode)
    if currentNode == startNode:
        break
    currentNode = nodeTable[currentNode].previous
route.reverse()
print(">".join(chr(node+65) for node in route))

Wenn ich dein Script richtig verstehe, hast du die "total path length" doch 
schon als Node.distFromSource gespeichert.

Noch ein paar Empfehlungen zu deinem Code:

Du mischst Code-Schnipsel und Funktionsdefinitionen wild durcheinander. Ich 
empfehle, die Funktionen an den Anfang zu stellen und die "Schnipsel" alle 
zusammen in eine Funktion zu packen. Damit vermeidest du unnötige globale 
Variablen.
...
Hier das script soweit ich das hinbekommen habe:
import sys
invalid_node = -1
class Node:
    previous = invalid_node
    distFromSource = sys.maxsize
    visited = False
Das geht hier zwar, weil alle Klassenattribute "immutable" sind, aber eine 
__init__()-Methode wär sauberer. Ich würde auch Namen und Index des Knotens 
speichern.
...
#print ("\nReading the file into a list.")
txtfile = open("network.txt", "r")
numberOfNodes = 0
network = [] # array containing array of values for each line
In neuerem Python-Code verwendet man 

with open(...) as file:
    # use it

statt

file = open(...)
# use it
file.close()
...
# reading file line by line
for line in txtfile:
    # increment nodeCount
    numberOfNodes += 1
# remove end chars
    line = line.replace("\n","")
    line = line.replace("\r","")
# split by comma
    chars = line.split(",")
    ints = []
# convert string to int
    for char in chars:
        ints.append(int(char))
# append the list of ints to the array
    network.append(ints)
Als list-comprehension:
network = [
    [int(s) for s in line.split(",")]
    for line in txtfile
]

Zeilenenden vorher zu entfernen is überflüssig, da int() "whitespace" 
ignoriert:
...
...
...
int("   123\n")
123
...
# read start (dist value = 0) and end node
route = open("route.txt", "r")
for line in route:
    nodes = line.split(">")
    startNode = nodes[0]
    endNode = nodes[1]
# get the ascii value and minus 65
startNode =(ord(startNode) - 65)
endNode =(ord(endNode) - 65)
#print ("Start node: " + str(startNode))
#print ("End node: " + str(endNode))
# set currentNode to startNode
currentNode = startNode
# create another 2d array with previous node, distFromSource and visited
# use sys.maxint for distFromSource
nodeTable = [] # array containing nodeTable visited/unvisited,prevNode
and distFromSrc
i = 0
while i <= numberOfNodes:
Du erstellst numberOfNodes+1 Node-Instanzen. Ist das Absicht?
...
nodeTable.append(Node())
    i += 1
Indices werden in Python selten manuell verwaltet.
Entweder:

for i in range(len(network)):
    nodeTable.append(Node())

oder:

for i, distances in enumerate(network):
    nodeTable.append(Node())
...
# set distFromSource of startNode to 0
nodeTable[startNode].distFromSource = 0
nodeTable[startNode].visited = True
#write a function to find the neighbour of current node
def findNeighbours(currentNode):
    # find nearest neighbour of currentNode
    currentRow = network[currentNode]
    currentNeighbours = []
    i = 0
for distance in currentRow:
        #get back the index number / position if distance = 0
        if distance != 0:
            #save the location
            currentNeighbours.append(i)
        i += 1
    return currentNeighbours
#print ("\nCurrent Neighbours (of node " + str(currentNode) + "): ")
currentNeighbours = findNeighbours(currentNode)
#print (currentNeighbours)
# go through currentNeighbour and see if they have been visited or not
# set dustFromSource and previousNode for unvisited
...
def checkNeighbours():
Besser: explizite Argumente:

def checkNeighbours(currentNeighbours):
...
for neighbour in currentNeighbours:
        if nodeTable[neighbour].visited == True:
            pass
            #print ("True")
        else:
            # calculate their tentative distances from starting node
            # add distFromSource from currentNode with distance between
currentNode and neighbour
            # check if less than current
            newDistance = nodeTable[currentNode].distFromSource +
network[currentNode][neighbour]
if (nodeTable[neighbour].distFromSource > newDistance):
                nodeTable[neighbour].distFromSource = newDistance
                nodeTable[neighbour].previous = currentNode
checkNeighbours()
#printNodeTb()
# go through all distances of unvisited, find lowest and mark as visited
def setNewCurrentNode():
    global currentNode
Besser:

def setNewCurrentNode(currentNode):
    ...
    return currentNode

Aufrufen dann mit:

currentNode = setNewCurrentNode(currentNode)
...
index = 0
    currentDist = sys.maxsize
for node in nodeTable:
        if node.visited == False and node.distFromSource < currentDist:
            currentNode = index
            currentDist = node.distFromSource
        index+=1
nodeTable[currentNode].visited = True
    print ("\nNew Current Node: " + str(currentNode))
setNewCurrentNode()
Es sollte möglich sein, das Skript so umzuformulieren, dass 
checkNeighbours() und setNewCurrentNode() nur an einer Stelle aufgerufen 
werden (hab es nicht ausprobiert).
...
#total path length
tpl = 0
Wenn du statt tpl total_path_length als Namen der Variablen verwendest, 
ersparst du dir den Kommentar und erhöhst gleichzeitig die Lesbarkeit des 
Codes.
...
f = open("09023429spf.txt", "w")
f.write(str(chr(startNode+65)))
# check if destination node reached
while currentNode != endNode:
#print ("\nCurrent Neighbours (of node " + str(currentNode) + "): ")
    currentNeighbours = findNeighbours(currentNode)
    #print (currentNeighbours)
checkNeighbours()
    f.write(str(chr(currentNode+65)))
setNewCurrentNode()
    #add total path length
    tpl += nodeTable[nodeTable[currentNode].previous].distFromSource
f.write(str(chr(endNode+65)))
str() ist redundant, da chr(...) bereits einen String zurückliefert.
...
f.write(" " + str(tpl))
f.close()
print ("\nTotal path length: " + str(tpl) + "\n")
for node in nodeTable:
    print (str(node.previous) + " | " + str(node.distFromSource) + " | "
+ str(node.visited))
Das wird mit str.format() etwas übersichtlicher, z. B. 

print("{} | {} | {}".format(node.previous, node.distFromSource, 
node.visited))

oder

print("{0.previous} | {0.distFromSource} | {0.visited}".format(node))

Re: [Python-de] dijkstra algorithmus

Peter Otten