[Numpy-discussion] Robust Sorting of Points
Charles R Harris
charlesr.harris at gmail.com
Mon Oct 28 00:16:37 EDT 2013
On Sun, Oct 27, 2013 at 10:13 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:
>
>
>
> On Sun, Oct 27, 2013 at 12:28 PM, Freddie Witherden <freddie at witherden.org
> > wrote:
>
>> Hi all,
>>
>> This is a question which has been bugging me for a while. I have an (N,
>> 3) array where N ~ 16 of points. These points are all unique and
>> separated by a reasonable distance.
>>
>> I wish to sort these points into a canonical order in a fashion which is
>> robust against small perturbations. In other words changing any
>> component of any of the points by an epsilon ~ 1e-12 should not affect
>> the resulting sorted order.
>>
>> Considering a direct application of np.lexsort:
>>
>> In [6]: my_array = np.array([[-0.5, 0, 2**0.5],
>> [0.5, 0, 2**0.5 - 1e-15]])
>>
>> In [7]: my_array[np.lexsort(my_array.T)]
>> Out[7]: array([[ 0.5 , 0. , 1.41421356],
>> [-0.5 , 0. , 1.41421356]])
>>
>> however, if the small 1e-15 perturbation is removed the order changes to
>> the 'natural' ordering. Hence, np.lexsort is out.
>>
>> Rounding the array before sorting is not suitable either; just because
>> (a - b) < epsilon does not mean that np.around(a, decimals=x) ==
>> np.around(b, decimals=b).
>>
>> I am therefore looking for an efficient (= within a factor of 10 of
>> np.lexsort) solution to the problem. I've looked at writing my own
>> comparison function cmp(x, y) which looks at the next dimension if
>> abs(x[i] - y[i]) < epsilon however using this with sorted is thousands
>> of times slower. Given that I have well over 100,000 of these arrays
>> this is nuisance.
>>
>> My other idea is to therefore find a means of quickly replacing all
>> numbers within 10*epsilon of a given number in an array with that
>> number. This should permit the application of np.lexsort in order to
>> obtain the desired ordering (which is what I'm interesting in).
>> However, I am yet to figure out how to do this efficiently.
>>
>> Before I throw in the towel and drop down to C are there any other neat
>> tricks I am missing?
>>
>>
> Sort them as strings?
>
> In [17]: a = np.random.random((5,3))
>
> In [18]: a
> Out[18]:
> array([[ 0.64734085, 0.71582772, 0.82743219],
> [ 0.92769057, 0.46880266, 0.73836167],
> [ 0.11904745, 0.56834934, 0.16144849],
> [ 0.59186013, 0.90698496, 0.56950572],
> [ 0.7317681 , 0.93495724, 0.19217244]])
>
> In [19]: a.view('S24').sort(0)
>
> In [20]: a
> Out[20]:
> array([[ 0.7317681 , 0.93495724, 0.19217244],
> [ 0.64734085, 0.71582772, 0.82743219],
> [ 0.59186013, 0.90698496, 0.56950572],
> [ 0.92769057, 0.46880266, 0.73836167],
> [ 0.11904745, 0.56834934, 0.16144849]])
>
>
nvm, that won't work.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20131027/ea84e214/attachment.html>
More information about the NumPy-Discussion
mailing list