[Tutor] Re: processing lines and polygons

Michael Dunn Michael.Dunn at mpi.nl
Wed Oct 20 16:48:20 CEST 2004


Thanks Kent,

that's solved my problem nicely (although I now realize I've got a new one, see
below). Here's my code -- I've learnt python mostly from lurking on this list,
and although I find myself using python a lot, I've never shown any of my code
to anyone before, so I'd also appreciate general comments:

#!/usr/bin/env python
"""Take matlab/splus or mapgen vector and join broken lines"""
import re, sys

class Sequence:
    trace = False # append list of contributing input sequences to output
    output_format = "matlab" # matlab, mapgen, mif (MapInfo Interchange Format)

    def __init__(self, segment):
        self.points = [tuple(point.split("\t")) for point\
                in segment.strip().split("\n")]
        self.start = self.points[0]
        self.end = self.points[-1]
        self.history = []
        return

    def append(self, other):
        # only called if self is already in the output_sequences list
        self.history.extend(other.history)
        # 1. do the join
        if self.start == other.start or self.end == other.end:
            other.points.reverse()
            other.start, other.end = other.end, other.start
        if self.start == other.end:
            self.points = other.points[:-1] + self.points
        elif self.end == other.start:
            self.points = self.points + other.points[1:]
        # 2. update endpoints dictionary
        if self.points[0] == self.points[-1]:
            # it's become a closed polygon
            endpoints.pop(s.start)
        elif self.start != self.points[0]:
            endpoints.pop(self.start)
            self.start = self.points[0]
            endpoints[self.start] = self
        elif self.end != self.points[-1]:
            endpoints.pop(self.end)
            self.end = self.points[-1]
            endpoints[self.end] = self
        return

    def __str__(self):
        if Sequence.output_format == "mif":
            divider = "PLine\n   %s" % len(self.points)
        elif Sequence.output_format == "matlab":
            divider = "NA NA"
        elif Sequence.output_format == "mapgen":
            divider = "# -b"
        if Sequence.trace:
            divider = "%s <-- %s" % (divider, str(self.history))
        s = [divider]
        for point in self.points:
            s.append("%s\t%s" % point)
        return "\n".join(s)

endpoints = {}
output_sequences = []
mif_header = """version 3
Coordsys earth projection 1,104
Columns 1
  SEG_ID integer
Data
"""

if __name__ == "__main__":

    f = file(sys.argv[1], "rU")
    data = f.read()
    f.close()

    print >> sys.stderr, "Processing sequences"
    input_sequences = re.split("NA NA|# -b", data.strip())
    for i, s in enumerate(input_sequences[1:]):
        s = Sequence(s)
        s.history.append(i)
        if s.start in endpoints.keys():
            endpoints[s.start].append(s)
        elif s.end in endpoints.keys():
            endpoints[s.end].append(s)
        else:
            endpoints[s.start] = s
            endpoints[s.end] = s
            output_sequences.append(s)

    print mif_header
    Sequence.output_format = "mif"
    for s in output_sequences:
        print s
    print >> sys.stderr, "Original: %s sequences\nJoined: %s sequences" %\
            (len(input_sequences), len(output_sequences))

    # examined output for missing input (there was none)
    # L = []
    # for s in output_sequences:
    #     L.extend(s.history)
    # for i in range(len(input_sequences[1:])):
    #     if i not in L:
    #         print i,

I've rebooted into windows (yawn) and tested the output in MapInfo, and it's a
lot better, but there are still bits of coastline not joined up. When I zoom in
to the bits where the join should be, it looks like the points don't actually
match. The separation distance is much less than the resolution of the map
however, so I should be able to join them up automatically too. I would have to
change the lines testing whether s.start, s.end are in endpoints.keys() to
something testing whether they are in a radius of the endpoints. Or is there a
simpler way...?

Just thinking out loud, it occurs to me I could give the Sequence object
additional attributes snap_to_grid_start and snap_to_grid_end and use these to
determine whether lines should be joined, but join them using their original,
more precise, values. Hmmm. More tomorrow.

Big thanks to all the contributors on the tutor list! It's an amazing resource.

Michael


More information about the Tutor mailing list