[Numpy-discussion] Robust Sorting of Points
Charles R Harris
charlesr.harris at gmail.com
Mon Oct 28 00:13:26 EDT 2013
On Sun, Oct 27, 2013 at 12:28 PM, Freddie Witherden
<freddie at witherden.org>wrote:
> Hi all,
>
> This is a question which has been bugging me for a while. I have an (N,
> 3) array where N ~ 16 of points. These points are all unique and
> separated by a reasonable distance.
>
> I wish to sort these points into a canonical order in a fashion which is
> robust against small perturbations. In other words changing any
> component of any of the points by an epsilon ~ 1e-12 should not affect
> the resulting sorted order.
>
> Considering a direct application of np.lexsort:
>
> In [6]: my_array = np.array([[-0.5, 0, 2**0.5],
> [0.5, 0, 2**0.5 - 1e-15]])
>
> In [7]: my_array[np.lexsort(my_array.T)]
> Out[7]: array([[ 0.5 , 0. , 1.41421356],
> [-0.5 , 0. , 1.41421356]])
>
> however, if the small 1e-15 perturbation is removed the order changes to
> the 'natural' ordering. Hence, np.lexsort is out.
>
> Rounding the array before sorting is not suitable either; just because
> (a - b) < epsilon does not mean that np.around(a, decimals=x) ==
> np.around(b, decimals=b).
>
> I am therefore looking for an efficient (= within a factor of 10 of
> np.lexsort) solution to the problem. I've looked at writing my own
> comparison function cmp(x, y) which looks at the next dimension if
> abs(x[i] - y[i]) < epsilon however using this with sorted is thousands
> of times slower. Given that I have well over 100,000 of these arrays
> this is nuisance.
>
> My other idea is to therefore find a means of quickly replacing all
> numbers within 10*epsilon of a given number in an array with that
> number. This should permit the application of np.lexsort in order to
> obtain the desired ordering (which is what I'm interesting in).
> However, I am yet to figure out how to do this efficiently.
>
> Before I throw in the towel and drop down to C are there any other neat
> tricks I am missing?
>
>
Sort them as strings?
In [17]: a = np.random.random((5,3))
In [18]: a
Out[18]:
array([[ 0.64734085, 0.71582772, 0.82743219],
[ 0.92769057, 0.46880266, 0.73836167],
[ 0.11904745, 0.56834934, 0.16144849],
[ 0.59186013, 0.90698496, 0.56950572],
[ 0.7317681 , 0.93495724, 0.19217244]])
In [19]: a.view('S24').sort(0)
In [20]: a
Out[20]:
array([[ 0.7317681 , 0.93495724, 0.19217244],
[ 0.64734085, 0.71582772, 0.82743219],
[ 0.59186013, 0.90698496, 0.56950572],
[ 0.92769057, 0.46880266, 0.73836167],
[ 0.11904745, 0.56834934, 0.16144849]])
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20131027/2d358f72/attachment.html>
More information about the NumPy-Discussion
mailing list