[AstroPy] AstroPy Digest, Vol 81, Issue 12

Chris Beaumont beaumont at hawaii.edu
Thu Jun 20 12:50:52 EDT 2013


I thought I'd chime in on the pandas discussion :)

I'm starting to use pandas a bit more in my day-to-day work. The two
features most useful to me are:

1) Its file parsers are pretty robust and fast. I always try parsing CSV
files with pandas first.

2) For tables with lots of categorical data, the grouping
functionality is very nice. For example, calculations like "the mean age of
each spectral type of star in my catalog" are usually one-liners like
df.groupby(['spectral_type']).age.mean(). I spend a lot of time on the
"split-apply-combine" page on the pandas docs (
http://pandas.pydata.org/pandas-docs/stable/groupby.html).
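
As a minimal sketch of both points (the catalog, its column names, and the
ages below are made up for illustration, not taken from real data):

```python
import io

import pandas as pd

# Hypothetical catalog; names, spectral types, and ages are illustrative only.
CSV_TEXT = """name,spectral_type,age
star_a,G,4.5
star_b,G,5.5
star_c,M,1.0
star_d,M,3.0
star_e,K,2.0
"""

# Point 1: parse the CSV. A file path can be passed instead of a buffer.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Point 2: split rows by spectral type, apply mean() to the age column,
# and combine the results into a Series indexed by spectral type.
mean_age = df.groupby("spectral_type")["age"].mean()
print(mean_age)
```

The result is a Series with one entry per spectral type, so
mean_age["G"] looks up the mean age of the G stars directly.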

I won't speculate about whether that's enough of an asset to warrant a
dependency in astropy. I do agree that lots of other pandas features don't
translate as well into astronomy use.



On Thu, Jun 20, 2013 at 12:34 PM, Erik Tollerud <erik.tollerud at gmail.com> wrote:

> I'm of mixed minds about traits UI because once you know it you can make
> great GUIs with it, but I've spent a lot of time troubleshooting people's
> python installations to get traits to work.  That is, in general it can be
> tricky to get installed because of all the dependencies.  Maybe this has
> improved recently with Enthought's Canopy (or other new python distros),
> but that's been my past experience.
>
> More generally, the view in the astropy core package is that we don't want
> to put GUIs in the core because GUIs always carry lots of dependencies,
> which we don't want to be forced to deal with.  But part of the whole
> reason for affiliated packages was to get around this, so we're happy to
> see GUI-based affiliated packages.
>
>
> As for Pandas, to be totally honest, I don't see a huge amount to be
> gained from adding a Pandas dependency to Astropy.  It's honestly not clear
> what it gives the astronomy community that numpy does not already have.
>  The following quote from the Pandas web site has guided me to that
> conclusion: "*pandas* helps fill this gap, enabling you to carry out your
> entire data analysis workflow in Python without having to switch to a more
> domain specific language like R."
>
> I have been carrying out my entire data analysis workflow for some time
> now in python without using Pandas.  It looks to me like Pandas is a tool
> that was written by and for statisticians who use R.  While we can take
> lessons from this, it's not clear we get much out of it in an astronomy
> context. For example, how would it make astropy's NDData, Quantity, or
> Table better to use a Pandas DataFrame vs. a numpy array? Most of what we
> are doing is building astronomy-convenient interfaces, and I'm not sure
> what Pandas adds there, at the cost of a pretty heavy-weight dependency.
>
> It could just be that I don't know enough about Pandas, though.  So if
> someone who knows Pandas better can speak to this, I'm all ears.
>
>
>
>
> On Tue, Jun 18, 2013 at 3:35 PM, Thøger Rivera-Thorsen <trive at astro.su.se> wrote:
>
>>  Pandas is a part of the newly-defined SciPy stack, after all, so that
>> would be part of any science-oriented distribution worth its salt. In fact,
>> I think it could be a good idea for astropy in general to use under the
>> hood, but again, could clash with the philosophy of the project and
>> possibly also maintainability.
>>
>> As for offering my code or just my experience, I'll have to square it
>> with my supervisor first, and I also think it depends on what direction the
>> project in question will take. I'm positive about the idea (which is why I
>> wrote in the first place), but supervisor might think it is a better idea
>> to actually get my paper in the project wrapped up before sending the code
>> out there. Will get back about that one!
>>
>> /Emil
>>
>>
>>
>>
>>
>> On 2013-06-18 20:53, Slavin, Jonathan wrote:
>>
>> Hi Emil,
>>
>>  That looks very nice!  I don't see Pandas as a big issue in terms of
>> dependencies.  I don't know that much about traits, etc.  My thought about
>> the gui was just based on my experience with matplotlib, and the fact that
>> it is widely used -- though I would agree that too many dependencies can be
>> a deterrent to people using something.  Are you offering your code as a
>> starting point for the project?  It strikes me that many have gotten some
>> sort of fitting package to a point of personal usability but no one has the
>> time/interest/motivation to make a more generally usable package.
>>
>>  Jon
>>
>>  On Tue, Jun 18, 2013 at 2:34 PM, <astropy-request at scipy.org> wrote:
>>
>>> Date: Tue, 18 Jun 2013 20:39:55 +0200
>>> From: Thøger Rivera-Thorsen <thoger.emil at gmail.com>
>>> Subject: Re: [AstroPy] ESA Summer of Code in Space 2013
>>> To: astropy at scipy.org
>>> Message-ID: <51C0A97B.8090703 at gmail.com>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> I have been working on a fitting GUI for a while, although it is made
>>> with a specific task in mind.
>>> However, it is not based on Matplotlib but on Traits/Traitsui/Chaco and
>>> Pandas. It is made for a specific project I'm working on and as such not
>>> yet usable for more general cases, but it could be a starting point, if
>>> the dependencies don't conflict with astropy politics.
>>>
>>> Especially, I am happy about the choice of Pandas for managing a quite
>>> complex data structure (the fitted and/or guessed values of an arbitrary
>>> number of transitions for an arbitrary number of rows or collapsed rows
>>> of a spatially resolved spectrum), but also with the Traits-based
>>> interactive interface to build complex line profiles from single
>>> gaussians, good for fitting-by-eye and giving good initial guesses for
>>> fitting of complex line profiles. It hooks directly up to a wrapper I've
>>> made for lmfit, but given the modularity, it should be relatively easy
>>> to change to other backends.
>>>
>>> It's still a work-in-progress, but there are some screenshots here:
>>> http://flic.kr/s/aHsjGaEMGg .
>>> I know the choice and number of dependencies may be prohibitive but it
>>> saved a lot of work on the GUI, and Pandas means the difference between
>>> sanity and madness when it comes to keeping track of so many parameters.
>>>
>>> Cheers,
>>> Emil
>>>
>>
>>
>>
>>  ________________________________________________________
>> Jonathan D. Slavin                 Harvard-Smithsonian CfA
>> jslavin at cfa.harvard.edu       60 Garden Street, MS 83
>> phone: (617) 496-7981       Cambridge, MA 02138-1516
>> fax: (617) 496-7577            USA
>> ________________________________________________________
>>
>>
>>
>> _______________________________________________
>> AstroPy mailing list
>> AstroPy at scipy.org
>> http://mail.scipy.org/mailman/listinfo/astropy
>>
>>
>>
>>
>>
>
>
> --
> Erik
>
>
>


-- 
************************************
Chris Beaumont
Graduate Student
Institute for Astronomy
University of Hawaii at Manoa
2680 Woodlawn Drive
Honolulu, HI 96822
www.ifa.hawaii.edu/~beaumont
************************************