Re: [SciPy-user] SciPy-user Digest, Vol 37, Issue 6
That's OK William. I gave it some more thought, and I don't think there's any sparse storage scheme which would allow me to do constant or log time slicing over two axes. The results from slicing the first axis would always have to be sorted, or individually tested, giving on average a complexity of O(n*log(n)) for the one axis sort and O(k*(log(n) + n/2)) for the lookups, where n is the number of intervals, and k is the number of stab inquiries. Since both k and n are high for my case, I've decided to go the traditional route and use an interval tree, which has a complexity of O(n*log(n)) for pre-processing, and only O(k * log(n)) for the lookups. Thanks anyway, Brendan ------------------------------ Message: 5 Date: Fri, 8 Sep 2006 07:52:14 +0200 From: "William Hunter" <willemjagter@gmail.com> Subject: Re: [SciPy-user] Efficient submatrix of a sparse matrix To: "SciPy Users List" <scipy-user@scipy.org> Message-ID: <8b3894bc0609072252k6f5cdee5p367695c3dd96371f@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Brendan; It occured to me after I sent the mail that you're probably not into solving sparse systems, but actually just slicing sparse matrices... If so, my previous post is obviously not of much meaning to you. -- WH On 07/09/06, Brendan Simons <brendansimons@yahoo.ca> wrote:
Hi all,
I have a neat little problem for which I think sparse matrices are the answer. I need to store a bunch of overlapping "intervals", (a, b pairs for which any a <= val < b crosses that interval) in a manner which makes "stabbing inquiries" (which intervals cross a specified value) easy. The canonical way to to this is with interval trees (a good summary here: http://w3.jouy.inra.fr/unites/miaj/public/vigneron/ cs4235/l5cs4235.pdf), however I think I can simplify things as follows:
1) perform an argsort on the interval a and b endpoints. This gives the position of each interval in an "order" matrix M, where each row and each column contains only one interval, and intervals are sorted in rows by start point, and in columns by endpoint
2) for a "stab" point x, do a binary search to find the row i and column j where x could be inserted into M and maintain its ordering. The submatrix of M[:i, j:] would contains all those intervals which start before x, and end after x.
Since the order matrix M is mostly empty it would make sense to use a sparse matrix storage scheme, however the only one in scipy.sparse that allows 2d slicing is the dok_matrix, and the slicing algorithm is O(n), which is much too expensive for my purposes (I have to do thousands of stabbing inquiries on a space containing thousands of intervals).
Is there no more efficient algorithm for 2d slicing of a sparse matrix?
-Brendan -- Brendan Simons __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com _______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
participants (1)
-
Brendan Simons