I appreciate your thoughts on this. I am a little leery of jumping into a full-fledged effort to fix the docs until we identify very concrete areas that need improvement, as it seems you have already done with this latest PR.

But there are some things in the survey that we are just not able to handle except on a case-by-case basis (e.g., "if I run an example notebook on my large dataset, it fails in some unexpected way"). Also, I feel that expecting code still in a very fluid state of development to have extensive documentation is unrealistic. We simply don't have the resources for that (it's beyond "not wanting" to do it), and my experience has been that as I develop new functionality, I often decide that the way I thought it should be done last week was actually pretty terrible and that it should now be done differently. These things take time and experience to shake out properly. It ought to be a given that newly released or still-in-development code will be buggy and incompletely documented, and that it is up to the end user to run some basic sanity checks on their results to make sure they're not nonsense.  For example, if I haven't checked that my simulation conserves energy (if it's supposed to, anyway) before I send the paper out, then that's on me.
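A sanity check like the energy-conservation example above can be automated in a few lines. This is a minimal sketch, assuming hypothetical per-snapshot total energies; in practice you would read these from your simulation code's diagnostic output:

```python
# Hypothetical total-energy values recorded at each snapshot of a run;
# in practice, read these from the simulation's diagnostic output.
total_energy = [1.0000, 0.9991, 1.0004, 0.9987, 1.0009]

# Relative drift of the total energy over the course of the run.
drift = abs(total_energy[-1] - total_energy[0]) / abs(total_energy[0])

# Flag the run if the energy drifts by more than an (arbitrary) 0.5%.
assert drift < 5e-3, f"energy not conserved: relative drift is {drift:.2%}"
print(f"relative energy drift: {drift:.2%}")
```

A check this cheap can run at the end of every analysis script, well before a paper goes out the door.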

That said, any area of prominent, important, or stable functionality that is poorly documented or documented incorrectly is definitely on us, which is why having docs and docstrings that can themselves be tested is so important.
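One concrete way to get docs that test themselves is Python's built-in doctest module, which executes the examples embedded in docstrings and fails if the output differs. This is a minimal sketch with a hypothetical helper function, not part of yt's actual API:

```python
def total_mass(masses):
    """Return the summed mass of a list of particle masses.

    The example below is both documentation and a test: running this
    module through doctest executes it and checks the output.

    >>> total_mass([1.0, 2.0, 3.0])
    6.0
    """
    return float(sum(masses))

if __name__ == "__main__":
    # Run all docstring examples in this module as tests.
    import doctest
    doctest.testmod()
```

Sphinx can run the same examples via its doctest extension, so the published docs stay verified as the code evolves.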

I’m not making excuses for us, but I do think we need to be realistic. 


On Dec 18, 2014, at 5:37 PM, Cameron Hummels <> wrote:

On Thu, Dec 18, 2014 at 1:56 PM, Matthew Turk <> wrote:

So, in light of that, what do we do?  It is obvious that we need:

 * Work on the transition from yt 2 to yt 3, particularly in analysis modules
 * Work on documentation, although I must confess from reading the responses it's not clear to me there is a "right answer" to the question of docs, as there are lots of contradictory opinions.  Not to say we don't know a lot of things that can be improved, but I was surprised at the lack of consensus on the overall problems with the docs.
 * Better tracking of issues that are outstanding in the community.  The issue tracker can be effective if we use it, but right now it is not followed with regularity.  When I personally sit down to do work on yt, I sit down to advance things that I have tracked personally, which often include bugs but also features, and I don't always say, "What is an easy bug to check the box on?"

I think these are clear areas of work.  I've tried to address many of the shortcomings in the documentation with the big PR I've been working on over the last three weeks, but improving the documentation will have to be a process we all stay vigilant about as we move forward.  I think that means including substantial comments in contributed source code so that it's possible to figure out what is going on, adding docstrings to outward-facing functions, documenting new features when they're added, and noting in the docs when a bug is discovered but not yet fixed.  All of those things were named in the survey responses.  I think we've seen some improvement on this since the release of 3.0, but we'll have to continue to work at it.

I personally do sometimes go to the issue tracker and look for things I can address.  If everyone does that when they have a little downtime once in a while, I imagine we could burn through a bunch of them.  
I read a paper by Stan Ahalt a while back about the agile process as applied to the water science center and their science-focused software development.  It involved frequent communication with the community and short bursts of activity.  In that case, they had a local team and a clearly-defined set of responsibilities; we have neither.

We've tried in the past having testing leaders, doc leaders, and on and on.  It sometimes works, sometimes doesn't.  And sometimes it makes those people bear the brunt of annoyance from other devs.  What should we try now?  Where are the weak points in our infrastructure, particularly those weak points that can be fixed without intruding on the lives and careers of the members of the community of developers?

I'm not sure of a way to make people do things they don't want to do most of the time (e.g., docs and testing).  I am giving up the role of docs czar because of the strife it has caused.  I'd rather have friends in the community than well-written documentation.  I don't know of a better solution, but the current system, at least with regard to docs, does not work.  I think ultimately the problem comes down to "how do you make someone do something for the betterment of the community when there is little personal incentive and time is short?"  I'm not trying to be cynical; I just don't think there is an easy answer to that.

What can we do better?  Do we need stronger leadership?  Weaker leadership?  Holding up new PRs?  An internalization or codification of values?  Rundowns of issue tracking, perhaps in an automated way?  More frequent, lower barrier-to-entry meetings where we go over things?  Should we call upon an external advisory board?

I also want to take a moment to discuss the yt 3 transition and to publicly eat crow about how that went.  The release came at a time for me when I'd been putting an enormous amount of effort into the code in an attempt to cut it off and release it before various things happened in my outside-of-yt life.  I was unsuccessful in that regard (which just made me want the emotional burden of a pending release gone even more), but the release went out shortly thereafter anyway.  And I take responsibility, because while in many ways it was well-vetted and robust (and I still believe it will be useful for growing our community), in other ways that were crucial to the *existing* community, particularly people who have been around for years and years, it was not sufficient.  And, it was my fault.  Disruption was inevitable and necessary, since we had to right some wrongs from past yt development, and I think we are recovering, but it would be nice if we could have sidestepped it a bit more.


On Tue, Dec 16, 2014 at 9:06 PM, Cameron Hummels <> wrote:
Fellow yt users:

A few weeks ago, we asked yt users and developers to fill out a brief survey to provide us with feedback on how well yt is meeting the needs of the community and how we can improve.  Thank you to all 39 people who responded; the results have given us a great deal to consider as we move forward with the code.  We summarize the results of the survey below, starting with the basic takeaway:

Overall Survey Takeaway:
The survey respondents are generally pleased with yt.  It meets their needs, has a wonderful community, is relatively easy to install, and has fair documentation.  Major short-term requests were for improvements in documentation, particularly in the API docs and source code commenting, as well as more cross-linking in the existing documentation and making sure the docs stay up to date.  Furthermore, people wanted more attention paid to making sure existing code in 3.0 works and to resurrecting all 2.x functionality in 3.0.

The single biggest takeaway from the survey is that the transition to yt 3.0 has been fraught with difficulties.  Many respondents expressed satisfaction with the new functionality in 3.0, but the overall transition process, from documentation to analysis modules to community response, was found to be lacking.

There were 39 unique responses to our survey.  75% of the respondents were grads and postdocs with a smattering of faculty, undergrads, and researchers.  Nearly everyone is at 4-year universities.  50% of the respondents consider themselves intermediate users, 20% novice, 20% advanced, and 10% gurus.

90% of the respondents use the standalone install script, with several users employing other methods (potentially in addition to the standalone script).  95% of the respondents rated installation as 3 or better (out of 5), with most settling on 4 out of 5.  Installation comments focused on better means of installing on remote supercomputing systems and on making pip installs work more reliably.

Community Responsiveness:
72% of respondents gave yt 5 out of 5 for community responsiveness, and 97% gave 3 or greater.  Clearly this is our strong point.  People contact the community for help in a wide variety of ways, the most popular being the mailing lists, the IRC channel, mailing developers directly, and searching Google.  Comments in this section were mostly positive, but one user wished for more concrete action after bugs were reported.

77% of respondents gave 4 or 5 out of 5 for the overall rating of the documentation.  Individual docs components were more of a mix.  Cookbooks were ranked very highly, and the quickstart notebooks and narrative docs were generally ranked well.  The two documentation components that seemed to be ranked lower (although still fair) were the API docs and comments in the source code, with 15% of respondents noting that they were “mostly not useful” (i.e., 2 out of 5).  There were a lot of comments on ways to improve the docs, which I bullet point here:
  • Organization of the docs is difficult to parse; hard to find what you’re looking for.
  • Hard to know what to search for, so make the command list (i.e., API docs) more prominent
  • Docs not always up to date (even between 3.0 and dev)
  • Discrepancies between API docs and narrative docs
  • Examples are either too simple or too advanced--need more intermediate examples
  • Units docs need more explanation
  • Not enough source code commenting or API docs
  • Not enough cross-linking between docs
  • More FAQ / Gotchas for common mistakes
  • API docs should include more examples and also note how to use all of the options, not just the most common.

88% of respondents found that yt meets their research needs (4 or 5 out of 5).  Respondents are using yt on a variety of datasets including grid data, octree data, particle data, and MHD, with only a handful of users dealing with spherical or cylindrical data at present.  Nearly all of the frontends are being used by respondents, with a few exceptions: Chombo, Moab, Nyx, Pluto, and non-astro data.  Visualization remains the main use of yt (97% of respondents), with simple analysis at 82% and advanced analysis at 62%.  Interestingly, 31% of respondents use the halo analysis tools, while only 15% use the synthetic observation analysis.

Big Picture:
51% of respondents gave yt 5 out of 5 for general satisfaction, with 28% giving 4 out of 5 and 15% giving 3 out of 5.  Overall, this is pretty good, but it is probably biased toward people motivated enough to fill out the survey.  Comments on the greatest strengths of yt include:
  • visualization capabilities
  • community support
  • flexibility
Comments on the biggest shortcomings of yt include:
  • documentation (see above)
  • learning to “think in yt”
  • adding new functionality while existing functionality is broken (or undocumented)
  • making sure 3.0 matches all functionality from 2.x
  • keeping the documentation up to date
  • making the transition from 2.x to 3.0 easier (how to update scripts)
Things to focus on in the next year:
  • documentation (almost unanimously)
  • making sure 3.0 can do all functionality from 2.x

Thank you for all of the valuable feedback.  We sincerely appreciate the constructive criticism; it will make for a better code and community!  We will put together a blueprint for addressing these shortcomings soon.  Look for it after the holiday break.  Have a wonderful holiday!

On behalf of the yt development team,


Cameron Hummels
Postdoctoral Researcher
Steward Observatory
University of Arizona

yt-dev mailing list
