I've just pushed some changes to the quad tree projection that should parallelize it automatically.
The old-style projection is parallelized through a spatial decomposition in the 2D plane of the image. This causes two problems: very poor load balancing in the current scheme, and the inability to use this operation in situ, since it requires passing data around in a manner different from the simulation code's own load balancing scheme. It can also simply be slow.
About a year ago I implemented a quadtree projection mechanism that was about an order of magnitude faster for big datasets. Unfortunately, because its data structures are more complicated, I never parallelized it. This last week I figured out how to do so, and implemented the parallelization in the quad_proj object in yt. I've tested it and it gives very good results for both memory and speed: it's about an order of magnitude faster than the old-style projection for my datasets, and the time-to-completion is so low that I haven't been able to measure its scaling.
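For anyone curious about the underlying idea, here is a toy sketch of how a quadtree projection can be parallelized. This is not yt's actual implementation, and every name in it (QuadNode, deposit, merge, leaves) is made up for illustration: each processor deposits its local cells into its own tree, and the per-processor trees are then merged, so no data needs to be redistributed ahead of time.

```python
class QuadNode:
    """One node of a toy projection quadtree; `value` accumulates the
    field contribution for the 2D image region this node covers."""

    def __init__(self):
        self.value = 0.0
        self.children = None  # None for a leaf, else a list of 4 sub-nodes

    def deposit(self, ix, iy, level, value):
        # Walk down `level` steps, creating children as needed, and add
        # `value` at cell (ix, iy) of that refinement level. We assume
        # (as with masked AMR cells) each region is deposited at exactly
        # one level.
        if level == 0:
            self.value += value
            return
        if self.children is None:
            self.children = [QuadNode() for _ in range(4)]
        half = 1 << (level - 1)
        q = (ix >= half) * 2 + (iy >= half)  # quadrant from the top bits
        self.children[q].deposit(ix % half, iy % half, level - 1, value)

    def merge(self, other):
        # Combine another tree into this one; this is what lets each
        # processor build a tree from its local grids independently.
        self.value += other.value
        if other.children is None:
            return
        if self.children is None:
            self.children = other.children
            return
        for mine, theirs in zip(self.children, other.children):
            mine.merge(theirs)

    def leaves(self, ix=0, iy=0, level=0):
        # Yield (ix, iy, level, value) for every leaf: the final image
        # at mixed resolution.
        if self.children is None:
            yield ix, iy, level, self.value
            return
        for q, child in enumerate(self.children):
            yield from child.leaves(ix * 2 + q // 2, iy * 2 + q % 2,
                                    level + 1)


# Two "processors" deposit into separate trees, then merge:
local, remote = QuadNode(), QuadNode()
local.deposit(0, 0, 1, 1.0)    # a coarse cell in one quadrant
local.deposit(3, 3, 2, 2.0)    # a fine cell elsewhere
remote.deposit(3, 3, 2, 0.5)   # another processor's contribution
local.merge(remote)
image = {(ix, iy, lv): v for ix, iy, lv, v in local.leaves()}
```

The merge step is the key: because trees can be combined pairwise after the fact, the deposit phase follows whatever domain decomposition the simulation already uses, which is exactly what makes the in situ case tractable.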
It would be great if some other people could test it, to see how well it scales for them. It should perform best in parallel where spatial decomposition gives poor results -- typically with deeply nested hierarchies, or with refinement regions that do not cover the entire box. If you are interested in testing it in situ, that would be valuable as well.
To use it, you can simply do:
pf.h.proj = pf.h.quad_proj
and then run the normal PlotCollection, light cone, etc. operations, or you can manually create quad_proj objects:
qp = pf.h.quad_proj(0, "Density")
(for instance) and examine the results and their timings.
I would like to replace the old-style projection with this for the 2.2 release, once we've gone back and forth and made sure it is up to snuff, so your testing is GREATLY appreciated to avoid any hiccups along the way.