[Tutor] Re: processing lines and polygons

Michael Dunn Michael.Dunn at mpi.nl
Fri Oct 22 16:11:11 CEST 2004


> Message: 1
> Date: Fri, 22 Oct 2004 06:03:28 -0400
> From: Kent Johnson <kent_johnson at skillsoft.com>
> Subject: Re: [Tutor] Re: Tutor Digest, Vol 8, Issue 67
>
> A few tweaks for your current algorithm:
>
> - Don't do the sqrt in get_distance(), you don't need the actual distance.
> Change Sequence.distance to 0.005**2 and everything will work the same.

Of course! Lovely!

> - Keep the dict keys as floats so you don't have to convert every time in
> get_distance(), the conversion is likely to be expensive.

Good idea too.

> - This is gross abuse of a dictionary! You are getting the keys as a list
> and searching the list!!
>          if s.start in endpoints.keys():
>              endpoints[s.start].append(s)
>
> instead use
>          if s.start in endpoints:
>              endpoints[s.start].append(s)
>
> or my preference which only looks up s.start once:
>        try:
>              endpoints[s.start].append(s)
>        except KeyError: pass

Oh dear. I'm a professional lexicographer, and I've certainly never been
accused of dictionary abuse before. That really smarts ;). The try/except
version is much nicer.

> Not sure why you have this. s.start will only be in endpoints once:
> >         if self.points[0] == self.points[-1]:
> >             try: endpoints.pop(s.start)
> >             except KeyError: pass # I don't understand why this exc. is
> > called

If the first point in a sequence is the same as the last point in a sequence
then the sequence is a closed polygon, and I thought I wouldn't need to join
anything else to it. Now I'm forcing sequences to close up when their endpoints
are within the snap tolerance I should probably not do this any more.

> Message: 5
> Date: Fri, 22 Oct 2004 07:38:54 -0400
> From: Kent Johnson <kent_johnson at skillsoft.com>
> Subject: Re: [Tutor] Re: Tutor Digest, Vol 8, Issue 67
>
> A couple more quick ideas:
>
> You may not even need to take the sum of squares in get_distance().
> max(abs(ax-bx), abs(ay-by)) would probably work fine for your purposes,
> since you are just trying to find points that are visually close.

I get it--this would find the closest point on the horizontal or the vertical
axis rather than measuring the diagonal. Part of me doesn't like that as an
approximation, since the data points we are dealing with are real places in
real space, and it seems disrespectful. But thinking about it I'd be amazed if
this ever caused a problem, since the pairs of points I want to join are all
likely to be long way from other pairs that need joining. So I'm convinced.

> You could keep a sorted list of (x, y) coordinates and use that for finding
> the nearest neighbor. Use the bisect module to find where to insert a new
> point in the list. To find the closest point search both ways in the list
> looking for points that are close. If you use the distance metric above you
> can stop searching when abs(ax-bx) is greater than your "closeness" cutoff.

I've never looked at the bisect module before, but from the docs I can see what
you mean. I'll have a play with it and see what I can do.

> HTH
A great deal, TYVM,

Michael



More information about the Tutor mailing list