14-hour load time for Enzo dataset with VR, vs. 30 minutes with ProjectionPlot?
Hello yt people,

We're trying to render imagery of a pretty large Enzo snapshot (~160GB, in 330,000 grids in 512 HDF5 domains) with yt-3.3dev.

On a reasonably fast Linux machine, we can do a ProjectionPlot of a few variables in about 30 minutes, running single-threaded while it scans the data (which is what takes most of the time). Data access pattern: we see it reading through each of the HDF5 files in numerical order (cpu0000, cpu0001, ...), taking a few seconds each, and opening each file exactly once.

On the same machine and same dataset, using the volume rendering API, the data-scanning process takes about *14 hours* (not counting any rendering time). (On Blue Waters, Kalina, using a similar dataset, couldn't get it to finish within a 24-hour wall-clock limit.) Data access pattern: it opens an HDF5 file many times in quick succession, then opens another, then opens the previous file a bunch more times. I'm guessing it grabs one AMR grid from each HDF5 open:

    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0074", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0075", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0074", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0075", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0235", O_RDONLY) = 3
    open("/fe0/deslsst/renaissance/normal/RD0074/RedshiftOutput0074.cpu0357", O_RDONLY) = 3

This is trouble. Is there anything we can do to make load times less extravagant when using VR on Enzo? What if we ran "ds.index" before?

I tried running cProfile on it, as in "python -m cProfile myscript.py" ... Happy to point anyone at the dataset on our systems or BW, but at this scale it's not a very portable problem.
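For concreteness, a minimal sketch of that profiling recipe, with ds.index forced up front so index construction is separated from the VR data scan (the script name, dataset path, and field are placeholders):

    # myscript.py -- illustrative driver; profile with:
    #   python -m cProfile -o vr.prof myscript.py
    import yt

    ds = yt.load("RD0074/RedshiftOutput0074")  # placeholder path
    ds.index                                   # build the grid index up front
    sc = yt.create_scene(ds, field=("gas", "density"))
    sc.save("render.png")                      # triggers the slow data scan + render

    # Afterwards, inspect the hot spots:
    #   import pstats
    #   pstats.Stats("vr.prof").sort_stats("cumulative").print_stats(20)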
I don't think too many people have done a volume rendering this big, so
you're likely hitting scaling issues that haven't been looked at closely.
Have you tried doing any sort of parallel volume rendering? yt supports
decomposing in the image plane in parallel using the MosaicCamera.
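For instance, a minimal MPI-parallel sketch with the Scene API, run under mpirun (dataset path, field, and resolution are placeholders, and how well 3.3dev decomposes this particular workload is an assumption; the MosaicCamera's image-plane interface itself differs):

    # render_parallel.py -- run as: mpirun -np 8 python render_parallel.py
    import yt

    yt.enable_parallelism()   # requires mpi4py; distributes yt's work over MPI ranks

    ds = yt.load("RD0074/RedshiftOutput0074")           # placeholder path
    sc = yt.create_scene(ds, field=("gas", "density"))  # default camera + volume source
    sc.camera.resolution = (1024, 1024)
    sc.save("volume_render.png")                        # root rank writes the image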
-Nathan
Hi Stuart,
My guess is that this is all related to ghost zones. This has been a really bothersome issue with yt 3 that hasn't been addressed yet; there is a non-linear, truly terrible slowdown in getting ghost zones in yt 3 compared to yt 2. So if you turn off ghost zones in your VR, I suspect it'd go a lot faster. Unfortunately, it'll look a lot worse. There is hope, but the timescale for fixing it is unknown.
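A hedged sketch of that switch, assuming the Scene API exposes it the way later yt releases do, via a use_ghost_zones flag on the VolumeSource (the older Camera interface took a no_ghost keyword for the same trade-off; dataset path and field are placeholders):

    import yt

    ds = yt.load("RD0074/RedshiftOutput0074")           # placeholder path
    sc = yt.create_scene(ds, field=("gas", "density"))  # placeholder field
    source = sc[0]                   # the VolumeSource built by create_scene
    source.use_ghost_zones = False   # assumption: skips ghost-zone interpolation;
                                     # much faster load, but seams at grid boundaries
    sc.save("render_no_ghost.png")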
We haven't tried parallelizing. We could do that. But the main problem is: why should it take ~30x longer using the volume-rendering pathway than using ProjectionPlot, when both should need to examine all the data, right?
When you make a projection plot, you don't use ghost zones. The ghost
zone algorithm is pretty bad in the new version, but it was the best I
could do at the time. I'd be happy to go over this in detail, or
write it up, if that'd help.
For reference, I've definitely noticed that doing VR or off_axis_projections with moderately sized datasets takes about 5-10x longer than ProjectionPlots of the same dataset. Last week I was playing with a non-cosmological Enzo dataset that is 1.6GB in total size, and it took about 3-4 minutes to do an off_axis_projection, whereas it took about 20 seconds to do a ProjectionPlot.
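For anyone reproducing that comparison, a minimal timing sketch along the lines of the yt 3 cookbook's off-axis projection recipe (dataset path, field, and viewing direction are placeholders):

    import time
    import numpy as np
    import yt

    ds = yt.load("DD0046/DD0046")     # placeholder Enzo dataset

    t0 = time.time()
    yt.ProjectionPlot(ds, "z", ("gas", "density")).save("axis_proj.png")
    print("ProjectionPlot: %.1f s" % (time.time() - t0))

    t0 = time.time()
    L = np.array([1.0, 1.0, 0.5])     # arbitrary line of sight
    image = yt.off_axis_projection(ds, ds.domain_center, L,
                                   ds.domain_width, 512, ("gas", "density"))
    yt.write_image(np.log10(image), "off_axis_proj.png")
    print("off_axis_projection: %.1f s" % (time.time() - t0))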
--
Cameron Hummels
NSF Postdoctoral Fellow
Department of Astronomy
California Institute of Technology
http://chummels.org
Hi Stuart,
I've started looking into this, and I've made some progress that may
help for your particular use case.
https://bitbucket.org/yt_analysis/yt/pull-requests/2025
-Matt
Thank you so much, Matt! I've been running some tests today with your new code on the full dataset - will let you know how they turn out.
participants (4)
- Cameron Hummels
- Matthew Turk
- Nathan Goldbaum
- Stuart Levy