I've just pushed some changes to the quad tree projection that should
parallelize it automatically.
The old style of projection achieves parallelism through a spatial
decomposition in the 2D plane of the image. This causes two
problems -- very poor load balancing in the current scheme, and the
inability to use this operation in situ, since it requires passing
data around in a manner different from the simulation code's own load
balancing scheme. Furthermore, it can be slow.
About a year ago I implemented a quadtree projection mechanism that
was about an order of magnitude faster for big datasets.
Unfortunately, because of the more complicated nature of the data
structures, I never parallelized it. This last week I figured out how
to do so, and implemented the parallelization in the quad_proj object
in yt. I've tested it and it gives very good results for both memory
and speed; it's about an order of magnitude faster than the old-style
projection for my datasets, and I have been unable to measure its
scaling because the time-to-completion is already so low.
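To illustrate the underlying idea (this is only a toy sketch, not yt's
actual implementation -- the class and function names here are invented
for illustration), a quadtree projection deposits cells at their own
refinement level and then, for each finest-level pixel, sums the
contributions of every coarser node along the path from the root:

```python
class QuadNode:
    """One node of a toy projection quadtree (2D, unit square)."""
    def __init__(self):
        self.value = 0.0
        self.children = None  # created lazily when a finer cell arrives

def insert(node, level, i, j, value):
    """Deposit a cell's value at refinement level `level`, where (i, j)
    indexes the 2**level x 2**level grid at that level."""
    if level == 0:
        node.value += value
        return
    if node.children is None:
        node.children = [QuadNode() for _ in range(4)]
    half = 1 << (level - 1)
    child = node.children[2 * (i >= half) + (j >= half)]
    insert(child, level - 1, i % half, j % half, value)

def flatten(node, acc=0.0):
    """Walk to the leaves, accumulating the values of all coarser nodes
    on the way down; each leaf total is one projected image value."""
    total = acc + node.value
    if node.children is None:
        yield total
        return
    for child in node.children:
        yield from flatten(child, total)

# A coarse cell covering the whole image, plus a finer cell in
# one quadrant: the projection sums both where they overlap.
root = QuadNode()
insert(root, 0, 0, 0, 1.0)
insert(root, 1, 0, 0, 2.0)
print(list(flatten(root)))  # [3.0, 1.0, 1.0, 1.0]
```

Because cells go straight into the tree wherever they live, no global
spatial decomposition of the image plane is needed, which is what makes
this approach friendlier to in situ use.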
It would be great if some other people could test it, to see how well
it scales for them. It should perform best in cases where the spatial
decomposition gives poor results -- often with deeply nested
hierarchies, or with refinement regions that do not cover the entire
box. If you are interested in testing it in situ, that would be
valuable as well.
To use it, you can simply do:

    pf.h.proj = pf.h.quad_proj

and run the normal PlotCollection, lightcone, etc. operations, or
you can manually create quad_proj objects:

    qp = pf.h.quad_proj(0, "Density")

(for instance) and examine the results and the time they take.
I would like to replace the old-style projection with this for the 2.2
release, if we can go back and forth and make sure it is up to snuff,
so your testing is GREATLY appreciated to avoid any hiccups along the
way.