[AstroPy] SQL-like queries against astropy.table objects

Aldcroft, Thomas aldcroft at head.cfa.harvard.edu
Mon Feb 10 12:52:11 EST 2014


On Sat, Feb 8, 2014 at 9:00 AM, Eric Jensen <ejensen1 at swarthmore.edu> wrote:

> Hi Brian,
>
> I think that your use case is actually pretty straightforward in an
> astropy table, since you just want to pull out a subset of the table.  It
> seems that what's tricky in the stackoverflow example you linked to is the
> JOIN and GROUPBY parts of the query, not the selection.
>
> You can easily select rows that meet multiple criteria, and use the
> resulting boolean array to create a new table with the subset of data you
> want.  Here's an example, not with your variables but just using a text
> file of interferometry complex visibilities I had around (U, V, W baseline
> lengths, real and imaginary parts of the visibility):
>
> In [32]:
>
> from astropy.table import Table, Column
>
> import numpy as np
>
> In [33]:
>
> t = Table.read('test_data.txt', format='ascii.commented_header')
>
> In [34]:
>
> t.colnames
>
> Out[34]:
>
> ['U', 'V', 'W', 'Real', 'Imag', 'Weight', 'Freq']
>
> In [35]:
>
> # Select only entries with long baselines in U and W, and Real amplitudes > 1 Jy:
>
> good_rows = np.logical_and(t['U'] > 100, t['V'] > 100, t['Real'] > 1)
>
> In [42]:
>
> # Make a new table from this selection, and see how many rows were in
>
> # new table vs. the original one:
>
> new_table = t[good_rows]
>
> new_table['U'].shape
>
>
> Out[42]:
>
> (120,)
>
> In [41]:
>
> t['U'].shape
>
> Out[41]:
>
> (4500,)
>
>
> There might be even better ways to do it (and there is almost certainly a
> cleaner way to get the number of rows in a table than calling 'shape' on
> one column), but this should give you the basic idea of the sort of
> selection that it seem like you're looking for.
>
> There's also a table-joining function:
>
>
> http://docs.astropy.org/en/latest/api/astropy.table.operations.join.html#astropy.table.operations.join
>
> For more complex operations, others who are more familiar with both
> astropy and SQL can chime in, but thought I'd offer what (little) I know on
> the selection part.
>

In addition to joins, astropy table has some support for group_by and table
concatenation (by rows or columns):

  http://astropy.readthedocs.org/en/latest/table/operations.html

For tables that can be manipulated entirely in memory this should be
reasonably efficient.  Simple selections with boolean arrays as shown
earlier are quite fast and efficient.

Cheers,
Tom



>
> Hope this helps,
>
> Eric
>
>
>
> On Feb 7, 2014, at 11:22 PM, Brian Kloppenborg wrote:
>
> Greetings,
>
> I have unfortunately fallen quite far behind on astropy development and
> my knowledge of the finer features of many of Python's libraries is
> limited because I do most of my work in C/C++.
>
> I have a use case in which I have a small (< 100k row) amount of static
> data which I need to both visualize and be able to easily manipulate
> (grouping, inclusion/exclusion based upon some criteria, application of
> some simple mathematics without altering the original data). I've used
> SQL extensively in the past and the queries which I would run are a
> perfect use case; however, visualizing the data with matplotlib (or
> similar) might be more efficiently accomplished from an astropy.table
> object. Is it possible to run an SQL-like query against an astropy.table
> object? If not with SQL-like syntax, how else could this be done?
>
> For example, say I have a table with the following columns (not the use
> case, but an easy demo):
> JD, mag, mag_error
>
> I could find all data points with an SNR > 100 within a date range by:
>     SELECT * from table WHERE (JD > value AND JD < value AND
> mag/mag_err > 100);
> quite quickly. Conversely, implementing something similar using numpy
> operations appears to be quite painful (e.g.
> http://stackoverflow.com/questions/7169240/moving-large-sql-query-to-numpy)
>
> unless this is already built-in to astropy.table objects.
>
> Kind regards,
> Brian Kloppenborg
> _______________________________________________
> AstroPy mailing list
> AstroPy at scipy.org
> http://mail.scipy.org/mailman/listinfo/astropy
>
>
>
> _______________________________________________
> AstroPy mailing list
> AstroPy at scipy.org
> http://mail.scipy.org/mailman/listinfo/astropy
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/astropy/attachments/20140210/4f24cf10/attachment.html>


More information about the AstroPy mailing list