[AstroPy] Astropy and large VOTable files

Andrew Hearin andrew.hearin at yale.edu
Mon May 18 14:09:44 EDT 2015


Reading a large data set in chunks, making cuts on each chunk, and returning a table of the rows that pass the cuts is a pretty common data mining task, and I think it would be good to include in Astropy. I’m happy to (re-)raise a GitHub issue for this and contribute some code, but first: Jennifer, this is the functionality you are describing, right? If so: Mike, do you see any fundamental obstacles to this?



On May 18, 2015, at 2:00 PM, Michael Droettboom <mdroe at stsci.edu> wrote:

> Thanks for the question.
> 
> Unfortunately, it will read the entire file into memory each time. It does read the data into a NumPy array, however, so the memory used should generally be less than the file's size on disk, depending on the content.
> 
> XML doesn't really support the kind of slicing that FITS (or another binary format) can, because you can't know how big something is (or even what it is!) without parsing the whole file. That said, given the constraints of the file format, minimal memory usage is one of the main design goals of astropy.io.votable, so I'd recommend trying it on large files and seeing how it goes. It shouldn't ever take significantly more memory than a binary array of the data, i.e. about the same as the equivalent FITS file loaded entirely into memory.
> 
> Cheers,
> Mike
> 
> On 05/17/2015 10:11 AM, Jennifer Baldwin wrote:
>> Hi all,
>> 
>> I was trying to find an answer to this but could not. I am wondering whether parse_single_table will attempt to read an entire VOTable file, or whether it will operate the same way as for FITS files, so that when you slice the returned data array, it only loads the part it needs into memory. I'm concerned about how it will perform with extremely large XML files, but could not find a direct answer anywhere in the documentation.
>> 
>> Thanks!
>> 
>> 
>> _______________________________________________
>> AstroPy mailing list
>> AstroPy at scipy.org
>> http://mail.scipy.org/mailman/listinfo/astropy
> 
