A few of us (Austin Gilbert, Penny Qian, Matt Turk, John Zuhone, and myself) had a discussion by email about possible avenues for collaboration between glue and yt, and we have decided it would make more sense to discuss this in the open.
To get started, I've included copies of the relevant parts of the emails below (oldest at the top). Please feel free to chime in!
We're also planning to have a Google Hangout to discuss this - I've created a Doodle, which you can fill out if you want to join the discussion:
In addition to this list, we'll be using the yt project slack site (with a special channel for glue) to discuss some of these ideas day-to-day, so just reply to me off-list if you want to be included in the slack channel, and I'll pass on the requests to the yt team.
Austin Gilbert: I have heard through the yt project network that you are looking into making it compatible with glueviz. As a big fan of user interfaces for data interaction, this is really exciting to me, and I would like to be as helpful as possible. Could you tell me what work is needed in order to make yt fully functional with glue? Aside from existing widgets in the framework, are you planning on adding some specific to yt? Let me know, because I am eager to help and to get this up and running.
Many thanks for the exciting initiative for the collaboration! Regarding your first question, honestly it is not super clear to me what exactly we should do to interface yt and glue, given that glue is built around a GUI while yt is a powerful data analysis and visualization package. Could we perhaps explore the following possibilities:
As for the second question, were you talking about adding some specific widget into the Glue framework, so that Glue can render the data from yt?
Thanks Austin and Penny for getting the conversation started on a yt <-> glue collaboration! :)
One of the places where I think it would be great for yt and glue to collaborate is in developing an abstract data layer in glue that better separates data access and computation from the interactive visualization, and in leveraging yt as a data access and computation layer. I'll describe a little what I mean by this below.
One of the main issues with glue currently is that it is:
Currently, glue loads data into Data classes, and viewers then access the data directly and do computations (for example calculate a histogram of all values). Calculating what sections of datasets fall inside subsets is also done outside of the data objects and is not done in a 'smart' way in that all the data has to be accessed, and the entire subset computed straight away.
In practice, what this means is that we have a FITS reader that understands memory mapping, but as soon as you do something like compute a histogram of all pixel values, all the data has to be read, and you lose the ability to deal with large files. Similarly, if the user makes a selection in the cube, often the whole cube has to be read in to determine which pixels have been selected.
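To make the histogram problem concrete, here is a minimal sketch of computing a histogram chunk by chunk, so that only one chunk is in memory at a time. It is not glue code: the function name chunked_histogram, the raw float64 file layout, and the chunk size are all assumptions made for illustration, and it uses only the Python standard library.

```python
import os
import tempfile
from array import array


def chunked_histogram(path, n_bins, lo, hi, chunk_items=4096):
    """Histogram of float64 values in a raw binary file, read chunk by chunk.

    Hypothetical helper for illustration; only `chunk_items` values are
    held in memory at any one time.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    item_size = array("d").itemsize
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_items * item_size)
            if not raw:
                break
            chunk = array("d")
            chunk.frombytes(raw)
            for v in chunk:
                if lo <= v <= hi:
                    # Clamp so that v == hi falls in the last bin.
                    idx = min(int((v - lo) / width), n_bins - 1)
                    counts[idx] += 1
    return counts


# Build a small test file: 10000 values, 1000 in each unit-wide bin.
values = array("d", [(i % 10) + 0.5 for i in range(10000)])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(values.tobytes())
    path = tmp.name
try:
    counts = chunked_histogram(path, n_bins=10, lo=0.0, hi=10.0)
finally:
    os.remove(path)
print(counts)
```

The point is only that the accumulation step does not need the whole array at once; a real implementation would presumably work on memory-mapped array slices rather than a Python loop.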
A better mechanism would be to develop what I refer to as an abstract data/computation layer: we define an API that any data object needs to provide, covering not only data access but also computations such as fixed resolution buffers or the selection of subsets. One could then implement a much wider variety of data objects, for example a data object that is powered by yt behind the scenes, or a data object that communicates with a remote computer cluster on which the data is stored.
The interactive visualization part of glue would then not need to worry about the details of the data access - it would essentially say 'I need a fixed resolution buffer with these dimensions', or 'I need a histogram', and this would be delegated to the data object.
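As a rough sketch of what such a layer could look like, the following defines an abstract interface and one trivial in-memory implementation. Every name here (AbstractDataObject, InMemoryData, compute_histogram, compute_fixed_resolution_buffer, viewer_image) is hypothetical and chosen for illustration only; none of it is an existing glue or yt API.

```python
from abc import ABC, abstractmethod


class AbstractDataObject(ABC):
    """Hypothetical interface that viewers would talk to."""

    @abstractmethod
    def compute_histogram(self, field, n_bins, lo, hi):
        """Return a list of n_bins counts for values in [lo, hi]."""

    @abstractmethod
    def compute_fixed_resolution_buffer(self, field, shape):
        """Return a 2D list of the requested (ny, nx) shape."""


class InMemoryData(AbstractDataObject):
    """Simplest possible backend: 2D lists held in memory.

    A yt-backed or remote backend would implement the same two methods.
    """

    def __init__(self, fields):
        self._fields = fields  # dict: field name -> 2D list of numbers

    def compute_histogram(self, field, n_bins, lo, hi):
        counts = [0] * n_bins
        width = (hi - lo) / n_bins
        for row in self._fields[field]:
            for v in row:
                if lo <= v <= hi:
                    counts[min(int((v - lo) / width), n_bins - 1)] += 1
        return counts

    def compute_fixed_resolution_buffer(self, field, shape):
        # Nearest-neighbour resampling onto the requested grid.
        data = self._fields[field]
        ny, nx = len(data), len(data[0])
        out_ny, out_nx = shape
        return [
            [data[j * ny // out_ny][i * nx // out_nx] for i in range(out_nx)]
            for j in range(out_ny)
        ]


def viewer_image(data, field, shape):
    """A 'viewer' that only knows the abstract interface."""
    return data.compute_fixed_resolution_buffer(field, shape)


grid = [[r * 4 + c for c in range(4)] for r in range(4)]
data = InMemoryData({"density": grid})
buffer = viewer_image(data, "density", (2, 2))
hist = data.compute_histogram("density", n_bins=4, lo=0, hi=16)
print(buffer, hist)
```

The viewer function never touches the underlying array, which is the property that would let a different backend be swapped in without changing the visualization code.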
Of course, yt is perfectly suited to this since it already provides a data abstraction layer, so this would be a matter of defining an API for glue data objects, then writing a wrapper for yt. In future, one could even imagine running glue on a laptop, with a data object that communicates with a cluster that is running yt.
The end result would be that researchers could load up a large simulation in glue and be able to do the kind of linked data visualization that glue can normally do, which I think would be extremely powerful.
Of course, related to what Penny said, I think there are a couple of other avenues for collaboration:
When using the 3D viewers, we could have an 'export to yt' option which provides a yt script to produce a production-quality 3D visualization (the VisPy viewers we have look ok but I don't think the static output from these is anywhere near as nice as what yt can do). This would simply be a matter of writing a plugin for a yt exporter.
It would be fun to investigate the new yt OpenGL rendering and see how this compares to what we currently use (VisPy), and potentially develop a new viewer based on the yt OpenGL renderer.
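For the 'export to yt' idea, a minimal sketch of the script-generation step might look like the following. The function make_yt_script and the dataset path are hypothetical placeholders, and how such a function would be registered as a glue plugin is not shown. The generated script uses yt.load, yt.create_scene and Scene.save, which are standard yt volume rendering entry points, but the sigma_clip value is an arbitrary assumption.

```python
def make_yt_script(dataset_path, field, output_png):
    """Return the text of a yt script that volume-renders one field.

    Hypothetical exporter helper; `field` is a (field_type, field_name)
    tuple such as ("gas", "density").
    """
    field_type, field_name = field
    lines = [
        "import yt",
        "",
        f"ds = yt.load({dataset_path!r})",
        f"sc = yt.create_scene(ds, field=({field_type!r}, {field_name!r}))",
        f"sc.save({output_png!r}, sigma_clip=4.0)",
    ]
    return "\n".join(lines) + "\n"


# Placeholder path; a real exporter would use the file glue loaded.
script = make_yt_script("my_sim/output_0001", ("gas", "density"), "render.png")

# Check that the generated text is at least valid Python syntax.
compile(script, "<exported yt script>", "exec")
print(script)
```

A real exporter would also need to carry over the camera position and transfer function from the 3D viewer, which this sketch ignores.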
I think it would be great to discuss all of these ideas, and would like to suggest that we have a Google Hangout in the short term. I'll send out another email with a link to a Doodle poll!
I think a Google Hangout would be a great way to get started and ensure we have a unified plan. Additionally, we should go ahead and move this to a public email list. I would also like to recommend a Slack channel for day-to-day communications; the yt community has been using it for a while now to great effect.
In regards to the data abstraction layer you have described, I think that yt is definitely well suited to working with data objects and selecting regions of data in the case of large file sets. The data objects currently supported in yt enable smart file reading: when you create a subregion of data, only that data is read from disk, so very large datasets are not entirely unmanageable. Additionally, yt has a wide range of frontends for different data formats, so incorporating it into data objects could enable a whole new community to utilize glue. I think yt could accomplish what you are thinking and glue can accomplish what I'm thinking.
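As a toy illustration of reading only a subregion from disk, the sketch below seeks directly to the requested rows of a row-major float64 grid and reads nothing else. This is not how yt implements its data objects; the function read_rows and the file layout are assumptions made purely to show the principle.

```python
import os
import tempfile
from array import array


def read_rows(path, n_cols, row_start, row_stop):
    """Read rows [row_start, row_stop) of a row-major float64 grid.

    Hypothetical helper: seeks past the rows that are not needed, so
    the rest of the file is never read.
    """
    item_size = array("d").itemsize
    rows = []
    with open(path, "rb") as f:
        f.seek(row_start * n_cols * item_size)
        for _ in range(row_stop - row_start):
            row = array("d")
            row.fromfile(f, n_cols)
            rows.append(list(row))
    return rows


# Write a 100 x 8 grid where each value encodes its own position.
n_rows, n_cols = 100, 8
grid = array("d", [float(r * n_cols + c) for r in range(n_rows) for c in range(n_cols)])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(grid.tobytes())
    path = tmp.name
try:
    sub = read_rows(path, n_cols, 10, 12)
finally:
    os.remove(path)
print(sub)
```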
On the user side, I want to make sure that if you incorporate yt, yt users still get the capabilities of the program they are used to working with, alongside the linking capabilities and user interface that glue provides. For me this primarily means ensuring that glue can expose yt's standard plotting routines in some form of widget. I also like the idea of including the OpenGL features that yt can offer.
I will certainly let others in the yt community know about the hangout to discuss what could happen.