Mailman 3 June 2010 - yt-dev

code review request: data file initialization
by Matthew Turk 29 Jun '10

Hi all,

Could I get one or two people to look this over, and give me the thumbs up?

I applied a hotfix to the repo yesterday because of problems during the Enzo Workshop. It was trying to create the .yt file for every user, even though they didn't have write privs. This was because of an indentation error. I then committed a fix that was *also* not great (but worked to fix the workshop issues), and I have now, I think, settled upon one that works and that explores the four conditionals (noted in the patch).

http://paste.enzotools.org/show/857/

If I get a thumbs up, I'll commit.

-Matt
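For readers without access to the paste, here is a minimal sketch of the kind of check being discussed. The four cases in the actual patch are not spelled out in this message, so the four branches below (file exists and is writable, file exists but is read-only, file is missing but can be created, file is missing and cannot be created) are an assumption, and `ensure_data_file` is a hypothetical name rather than a yt function.

```python
import os


def ensure_data_file(path):
    """Create an empty file at `path` only when that is possible.

    Returns a short status string naming which of the four assumed
    cases applied; nothing is raised for a read-only location.
    """
    directory = os.path.dirname(path) or "."
    if os.path.exists(path):
        if os.access(path, os.W_OK):
            return "exists-writable"   # case 1: nothing to do
        return "exists-readonly"       # case 2: use it, never write to it
    if os.access(directory, os.W_OK):
        open(path, "a").close()        # case 3: safe to create it
        return "created"
    return "skipped"                   # case 4: no write privs, do not try
```

Returning a status instead of raising keeps an import from failing for users who merely lack write access, which is the symptom described above.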

3 3

yt version 1.7 Release Announcement
by Britton Smith 27 Jun '10

We're proud to announce the release of yt version 1.7, an analysis and visualization toolkit for Adaptive Mesh Refinement data. (Just in time for the Enzo Workshop!)

This release fixes a number of bugs and includes numerous improvements to the code base and the documentation. Most prominently, it features "two-point functions" such as structure and correlation functions, a re-engineered volume rendering interface, multivariate volume rendering, off-axis projections, and a mechanism for complex postscript plot layout. Additionally, a major aspect of the drive to 1.7 has been re-engineering the API documentation to be better suited to interactive help (the "help(...)" call in python) as well as the documentation website ( http://yt.enzotools.org/doc/ ).

Some of the changes since yt-1.6 (Released on January 22, 2010) include:

 * Direct writing of PNGs
 * Multi-band image writing
 * Parallel halo merger tree
 * Parallel structure function generator
 * Image pan and zoom object and display widget
 * Parallel volume rendering
 * Multivariate volume rendering, allowing for multiple forms of emission and absorption
 * Added Camera interface to volume rendering
 * Off-axis projection
 * Stereo (toe-in) volume rendering
 * DualEPS extension for better EPS construction
 * Rewritten, memory conservative and speed-improved contour finding algorithm
 * Speed improvements to volume rendering
 * Preliminary support for the Tiger code
 * Lightweight projection loading with projload
 * Improvements to yt.data_objects.time_series
 * Improvements to yt.extensions.EnzoSimulation
 * Speed improvements to basic HOP
 * Better docstrings and documentation

(The full changelog: http://yt.enzotools.org/doc/changelog.html )

yt features native support for Enzo (http://lca.ucsd.edu/projects/enzo) data, providing a natural and intuitive way to address physical regions in space as well as processed data.

Installation instructions can be found here: http://yt.enzotools.org/doc/installation.html . If you are running an older version of yt, re-obtaining and re-running the installation script should happily upgrade your installation.

yt is a Free and Open Source project, and we invite you to get involved. For more information, join the yt-dev mailing list, or see the hacking guidelines on the Wiki: http://yt.enzotools.org/wiki/HackingGuidelines .

We anticipate a few more releases in the 1.7 series as documentation and docstring coverage progress.

Sincerely,

The yt development team:

Matthew Turk
Stephen Skory
Britton Smith
John Wise
Jeff Oishi
Sam Skillman
Devin Silvia
David Collins

1 0

Quick note about SIGUSR2
by Matthew Turk 25 Jun '10

Hi guys,

Just a quick note about signal handling. In the past, sending SIGUSR1 to a running yt-imported python task would print the stack and SIGUSR2 would make it raise a runtime error. I've changed this behavior, so that SIGUSR2 actually inserts an IPython session at the current location in the running task. When the IPython session terminates (ctrl-d) it will continue executing. This will let you do things like modify state variables, inspect state variables, and so on.

I haven't added a convenience function for listing the surrounding code, but that shouldn't be too bad to do, and so if somebody wants to take that on it'd be great. I'd suggest just adding that function into the locals of the IPython session.

-Matt
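The behavior described above can be sketched with the standard library alone. This is not yt's actual handler code: the function names are hypothetical, and `code.interact` stands in for the IPython session mentioned in the message so that the sketch runs without IPython installed.

```python
import code
import signal
import sys
import traceback


def format_stack(frame):
    """Return the call stack leading to `frame` as a single string."""
    return "".join(traceback.format_stack(frame))


def print_stack_handler(signum, frame):
    # SIGUSR1: report where the task currently is, then carry on.
    sys.stderr.write(format_stack(frame))


def interactive_handler(signum, frame):
    # SIGUSR2: open a console that can see the interrupted frame's names.
    # code.interact is a stdlib stand-in for an IPython session (assumption).
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)
    code.interact(banner="Interrupted; press ctrl-d to resume.",
                  local=namespace)


def install_handlers():
    # SIGUSR1 and SIGUSR2 are available on POSIX systems only.
    signal.signal(signal.SIGUSR1, print_stack_handler)
    signal.signal(signal.SIGUSR2, interactive_handler)
```

After calling `install_handlers()` in a long-running script, `kill -USR1 <pid>` prints the stack and `kill -USR2 <pid>` opens the console. Because this sketch hands the console a copy of the frame's names, mutating an object from the console affects the running task, while rebinding a local name does not.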

1 0

1.7 Release
by Matthew Turk 25 Jun '10

Hi all,

I'm going to try to push out a yt-1.7 release this week before Friday. I've added docstrings (in the form of doc/docstring_example) to the plot collection, fixed resolution buffers, and about 1/2 of the volume rendering stuff. I will be documenting the new volume rendering interface and adding more docstrings, and then I'm calling 1.7 good.

John, can you put DualEPS into the main yt repo? If you have time -- and you probably don't :) -- would you mind adding docstrings for it?

Stephen, what is the status of your modifications to the docs/code? Is there anything else that needs to happen?

I would like it if we could convert all of the docstrings to the new format, but that will take a significant chunk of time and it shouldn't be required for 1.7.

Does anyone else have anything they think should go in?

-Matt

5 12

(slightly off topic) a very strange error message from numpy
by j s oishi 21 Jun '10

Hi All,

This afternoon I had some unexpected data corruption (thanks, GPFS!) which led me to find a very odd error. I don't know if this holds true for Enzo data anywhere, since it uses HDF5 and all, but to read Orion data we rely on numpy's fromfile() function. It turns out that if you seek past the end of a file, then call fromfile() on that filehandle, it throws a MemoryError. This makes very little sense from an end-user perspective. I was hunting through yt's source looking for some kind of memory-related error until I realized the datafile might be corrupt.

I can wrap the fromfile in a try/except statement to catch the MemoryError, but my question to you guys is, should I file a ticket upstream with the Numpy people? Or does this make sense and I am simply overlooking something?

Here's the simplest test case that shows the behavior:

    [jsoishi@volans ~]$ touch crap
    [jsoishi@volans ~]$ python -c 'import numpy; fi = open("crap"); fi.seek(100); numpy.fromfile(fi,count=10)'
    10 items requested but only 0 read
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    MemoryError

Any thoughts?

j
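As an alternative to catching the MemoryError, one could check the file size before reading. The sketch below is only an illustration of that idea, not code from yt or numpy; `bytes_remaining` and `require_items` are hypothetical helper names, and the check uses only the standard library.

```python
import os


def bytes_remaining(fh):
    """Bytes between the current position of `fh` and the end of the file.

    The result is negative when the position lies past the end.
    """
    size = os.fstat(fh.fileno()).st_size
    return size - fh.tell()


def require_items(fh, count, itemsize):
    """Raise IOError unless `count` items of `itemsize` bytes can be read."""
    needed = count * itemsize
    available = max(bytes_remaining(fh), 0)
    if available < needed:
        name = getattr(fh, "name", "<file>")
        raise IOError(
            "%s: need %d bytes at offset %d but only %d remain; "
            "the file may be truncated or corrupt"
            % (name, needed, fh.tell(), available))
```

Calling `require_items(fi, 10, 8)` just before `numpy.fromfile(fi, count=10)` (8 bytes per item, assuming the default float dtype) would turn the seek-past-the-end case into an error message that points at the data file rather than at memory.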

2 1

pyprof2html
by Matthew Turk 15 Jun '10

Hi all, A new version of pyprof2html was released: http://www.hexacosa.net/project/pyprof2html/ It now includes function callers. This is definitely a cool piece of software, and it is very useful for exploring profiling output. -Matt
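For anyone who has not produced profiling output before, here is a minimal sketch of writing a stats file with the standard library's cProfile. It assumes that a cProfile stats file is the kind of input such a viewer consumes; the pyprof2html command line itself is not shown, since its options are not described in this message. `busy_work` and `profile_to_file` are hypothetical names.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n):
    """Stand-in workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def profile_to_file(func, stats_path, *args):
    """Run func(*args) under cProfile and save the stats to stats_path."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    profiler.dump_stats(stats_path)
    return result


stats_path = os.path.join(tempfile.mkdtemp(), "busy_work.prof")
result = profile_to_file(busy_work, stats_path, 1000)

# The saved file can be reloaded for inspection with pstats.
stats = pstats.Stats(stats_path)
```

Calling `stats.sort_stats("cumulative").print_stats(5)` prints the five most expensive entries as plain text.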

1 0

plot_collection in hg
by Stephen Skory 10 Jun '10

Hi Matt,

I think you made a couple mistakes in your conversions from kwargs in plot_collection.py. With these changes, it works for me. If you think it's OK, I'll commit and push.

http://paste.enzotools.org/show/740/

_______________________________________________________
sskory(a)physics.ucsd.edu           o__  Stephen Skory
http://physics.ucsd.edu/~sskory/   _.>/ _Graduate Student
________________________________(_)_\(_)_______________

2 1

hg yt
by Matthew Turk 07 Jun '10

Hi all, The case changes in Clump.py -> clump.py have caused more trouble than I'd realized they would. I'm still in the process of working through this issue, but I think it'll be best fixed by renaming it to clump_handling.py, which I'm going to do... -Matt

1 0

Module renames
by Matthew Turk 07 Jun '10

Hi all,

I've renamed a bunch of modules to clean up after a mistake I made a long time ago. Fortunately, this also coincides with fixing things to come in line with the style guide better. I'd prefer we do this in SVN trunk, so that I don't miss any renames from applying a patch from hg=>SVN. This was done in r

Modules that share a name with a class *have* to be renamed, otherwise the import mechanism breaks in a number of cases, and I'm not sure I can work around this. Here's my list:

    PlotCollection.py => plot_collection.py
    FieldInfoContainer.py => field_info_container.py
    ObjectFindingMixin.py => object_finding_mixing.py

Britton, can you verify these are okay:

    Clump.py => clump.py
    HaloProfiler.py => halo_profiler.py
    EnzoSimulation.py => enzo_simulation.py
    LightCone*.py => light_cone_*.py

Stephen, can you verify:

    MergerTree.py => merger_tree.py

Keep in mind that all auto-generated documentation, as well as reliable selective imports, are broken when the module has the same name as the class. So we can rename one or the other...

And I apologize that I didn't catch any of these sooner. But I'm about 75% done with docstrings for PlotCollection.py, and this was breaking auto-generation of the docs.

Thanks,

Matt
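One way a module/class name clash can bite is sketched below. This is an illustration of Python's import behavior, not a reproduction of the exact failures seen in yt: the package names, the `Widget` class, and the helper functions are all hypothetical. When a package's `__init__.py` pulls a class out of a module of the same name, the class replaces the module as the package attribute, so tools that look up `package.Name` expecting a module find a class instead.

```python
import importlib
import inspect
import os
import sys
import tempfile


def make_package(root, pkg_name, module_name):
    """Write a package whose __init__ re-exports Widget from module_name."""
    pkg_dir = os.path.join(root, pkg_name)
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, module_name + ".py"), "w") as f:
        f.write("class Widget(object):\n    pass\n")
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("from .%s import Widget\n" % module_name)


def describe(pkg_name, module_name):
    """Report what the package attribute named module_name refers to."""
    pkg = importlib.import_module(pkg_name)
    attr = getattr(pkg, module_name)
    return "class" if inspect.isclass(attr) else "module"


root = tempfile.mkdtemp()
sys.path.insert(0, root)
make_package(root, "clashing_pkg", "Widget")  # Widget.py defines Widget
make_package(root, "renamed_pkg", "widget")   # widget.py defines Widget
importlib.invalidate_caches()

clash = describe("clashing_pkg", "Widget")    # the class shadows the module
renamed = describe("renamed_pkg", "widget")   # the module stays reachable
```

With the lowercase module name, both `renamed_pkg.widget` (the module) and `renamed_pkg.Widget` (the class) stay reachable, which is what the renames listed above achieve.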

3 8

yt documentation, standards, implementation
by Matthew Turk 05 Jun '10

Hi all,

This last Friday I had a chance to talk to Tom Abel and Oliver Hahn (both CC'd on this message) about their experiences with using yt, and they brought up some points which I've now had a chance to think about, and which I find very interesting, certainly as something to discuss. Here are my notes on it, along with a proposal for moving forward.

As a quick note, what really hit home that we need better documentation was trying to make a thin projection. The definition of what a 'source' could be wasn't there, there were no examples, and I had to go look at the source to figure out what the parameters were even called. I think that's not ... good.

Python Inline Documentation
===========================

One of the coolest things about Python is the help() function, which prints out the function signature and the contents of the doc string. In the source code, the docstring is inline in the function, like so:

    def some_function(a, b, c):
        """
        This function does something.
        """
        return a+b+c

The output of help(some_function) would look like this:

    >>> help(some_function)
    Help on function some_function in module __main__:

    some_function(a, b, c)
        This function does something.

    >>>

Generated Documentation
=======================

The yt docs are generated using an extension to Sphinx called autodoc. What this does, as you can see by going to the API docs and clicking "view source" (which, counterintuitively, displays the doc source and not the source code of the functions), is pull all the docstrings from the source at documentation build time and render them in the document. Ideally, we would want something that renders nicely as well as looks good in the inline help -- and to maximize the detail without becoming cumbersome.

Most of the functions in yt that have docstrings have them written in a narrative style, with parameters inside asterisks, so that they would render nicely in the API docs:

http://yt.enzotools.org/doc/modules/amrcode.html#yt-lagos-outputtypes-outpu…

But, it's becoming clear that perhaps this is not the best approach. I think a combination of narrative and explicit parameter declaration would be better. The NumPy/SciPy projects have a CodingStandards description:

http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines

that covers docstrings, with a very detailed example of a completely filled out docstring here:

http://svn.scipy.org/svn/numpy/trunk/doc/example.py

As an example, the 'tensorsolve' function is defined here:

http://svn.scipy.org/svn/numpy/trunk/numpy/linalg/linalg.py

and the API docs are here:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.tensorsolv…

This looks great, I think. yt is a bit more class-oriented than NumPy, but I believe that we should strive for a similar level of detail as well as a similar style: presenting parameters, what those parameters can be, and a brief word on the return type.

Ideal Type Of Documentation
===========================

A few weeks ago, Tom and I were chatting and he mentioned to me a Pascal manual. In this manual, there was a single function on every page: a description, parameters (often repeated between functions, but explicitly listed for each), and an example. My first Unix manual was exactly like this, and I remember it being one of the best sets of documentation I've ever used. I believe this is the model NumPy and SciPy are striving for, as well.

I think this is what yt should strive for, too. One page per class or function, with a description, parameters, and examples -- just like mentioned above. In doing so, I think that the online help -- which right now is sort of helpful, but not amazingly helpful -- would become much more useful. The fact that on the mailing lists we get questions asking us about fundamental operations in yt is, I think, an indictment of the way it's presented.

As the Enzo Workshop revs up, a couple of us will be writing talks about using Enzo, using yt, etc, and I think this is a time to harness that momentum to reorganize and rewrite some of the doc strings. Of course, I would take the lead on the initial rewrite, as I'm the one who wrote all the bad docstrings. What does everyone think about this?

Action Items
============

(It wouldn't be a long email about procedures if we didn't use a buzzword like 'action items' :)

Firstly: a vote and a request for comments. Do we want to agree on the NumPy standard for docstrings? What does everyone think about this idea, of a set of docstring guidelines, and trying to focus on a better set of API documentation, to be used both in generated form and inline via help()?

If we can agree on the NumPy standard, I believe that I should be able to convert most of the docstrings with relative ease; it's mostly going to be a matter of typing, copy/pasting, etc. I will copy a style guide into doc/, which will be largely taken from the NumPy style guide, but I will additionally add a document with examples for common strings: I would prefer we have a single, consistent manner for referring to things like AMR3DData as a source, for instance. I will then go through and convert all the doc strings that I am familiar with. This would leave us with three files:

 * Example docstring, which can be read in verbatim and edited.
 * List of yt idioms for cross-referencing and describing things.
 * File describing this standard, largely pulling from the NumPy standard.

The next thing will be, going forward, how do we ensure that the doc strings are correctly inserted with new code? I am more guilty of this than I would care to admit (I sometimes fall into the camp of thinking that functions with well-named parameters are self-documenting, which is probably a mistake!) but I think it would help to have someone agree to review incoming changesets for documentation updates, and to email the committer if a changeset does not have a sufficient docstring. My inclination is to suggest that someone who already reviews incoming changesets do this, which I think means either me, Sam or Stephen. Sam, would you be willing to take this on? It should be relatively straightforward.

Additionally, would anyone volunteer to help me out with rewriting some of the existing docstrings? In particular, for code you have contributed?

The End
=======

I think that if we really take the docstrings seriously, then the documentation on the whole will vastly improve. I am in the process of rewriting some sections, removing the old-style tutorial and trying to better walk the user through the process of getting up and running. The current documentation has a lot of information, but it's not very good at getting people up and running in anything other than the most simple manner. I think that getting started on improving the docstrings will also help refocus efforts toward better documentation on the whole.

And, I'd like to end by admitting culpability for the sorry state of the docstrings we currently have. But I think this might be good, in the long run, because it'll help out with getting us on track for a better code that's much easier to use!

And finally, thanks to Tom and Oliver for taking the time to chat with me about this -- I really appreciate their thoughtful feedback.

Best,

Matt
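To make the proposal above concrete, here is a sketch of a docstring laid out in the NumPy style: a one-line summary, then explicit Parameters, Returns, Raises and Examples sections. The function `mean_in_range` is a hypothetical stand-in chosen to keep the example self-contained; it is not part of yt.

```python
def mean_in_range(values, lower, upper):
    """
    Average the entries of `values` that lie in a closed interval.

    Parameters
    ----------
    values : sequence of float
        The numbers to select from.
    lower : float
        Smallest value to include.
    upper : float
        Largest value to include.

    Returns
    -------
    mean : float
        Arithmetic mean of the selected entries.

    Raises
    ------
    ValueError
        If no entry of `values` lies within [lower, upper].

    Examples
    --------
    >>> mean_in_range([1.0, 2.0, 3.0, 10.0], 1.5, 3.5)
    2.5
    """
    selected = [v for v in values if lower <= v <= upper]
    if not selected:
        raise ValueError("no values lie within [%r, %r]" % (lower, upper))
    return sum(selected) / len(selected)
```

The same text serves both audiences: `help(mean_in_range)` prints it verbatim at the prompt, and a documentation generator can render the sections as formatted lists.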

4 7