Hi Rocky,

That's still not great, but it does confirm my suspicion that the scaling will improve with bigger data. Hopefully that trend continues to hold for even larger datasets. If you have more than one dataset to analyze, you can also parallelize your loop over datasets with the parallel_objects and piter commands:
https://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelizing-over-multiple-objects
This may work better for you, since it requires significantly less communication between processes. As before, though, you'll ultimately be limited by the performance of your filesystem. In practice, I find that 8-16 processes is the absolute most I can use efficiently in parallel on an average computing cluster.
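Something along these lines, adapted from that docs page (the filenames, njobs value, and the derived quantity are placeholders for your own analysis):

import yt

yt.enable_parallelism()

# Placeholder output names -- substitute your own datasets.
fns = ["Data_000000", "Data_000010", "Data_000020", "Data_000030"]

# parallel_objects splits the list across groups of MPI processes; the
# storage keyword collects one result per dataset on every process.
storage = {}
for sto, fn in yt.parallel_objects(fns, njobs=4, storage=storage):
    ds = yt.load(fn)
    ad = ds.all_data()
    sto.result_id = str(ds)
    sto.result = ad.quantities.extrema(("gas", "density"))

if yt.is_root():
    for name, extrema in sorted(storage.items()):
        print(name, extrema)

Run it with something like mpirun -np 4 python script.py.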

Britton

On Wed, Jul 17, 2019 at 2:52 PM Nathan <nathan.goldbaum@gmail.com> wrote:
yt.load() doesn't do any I/O beyond reading some simulation metadata, so the I/O contention is likely still happening later in your script, when the field data are actually read.
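To illustrate (the filename and field here are just examples), the heavy reads happen at first data access, not at load time:

import time
import yt

t0 = time.time()
ds = yt.load("Data_000000")   # reads only the simulation metadata
t1 = time.time()

ad = ds.all_data()
ad["gas", "density"]          # the first field access does the real I/O
t2 = time.time()

print("load: %.2f s, field read: %.2f s" % (t1 - t0, t2 - t1))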

On Wed, Jul 17, 2019 at 9:43 AM Tseng, Po-Hsun <zengbs@gmail.com> wrote:
Hi Britton,

Thanks for your prompt reply. Following your advice, I moved the I/O step outside the timed region and made the total runtime longer [1]; a simplified sketch of the new structure follows the links below. I then obtained the scaling shown in [2]: the performance with 2 cores is only 124/112 ≈ 1.1 times better than with 1 core. Does this make sense? I hope I'm just missing something, because I have a huge amount of data to analyze. Thanks.

[1] http://paste.yt-project.org/show/152/
[2] http://i.imgur.com/3vALOL1.png
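
Schematically, the change looks like this (simplified; the placeholder filename and field are just stand-ins, the actual script is in [1]):

import time
import yt

yt.enable_parallelism()

ds = yt.load("Data_000000")   # placeholder filename; metadata only
ad = ds.all_data()
ad["gas", "density"]          # read the field up front, outside the timer

t0 = time.time()
# stand-in for the real analysis; with enable_parallelism() this derived
# quantity is computed in parallel across the MPI processes
result = ad.quantities.extrema(("gas", "density"))
t1 = time.time()

if yt.is_root():
    print("analysis time: %.1f s" % (t1 - t0))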

Rocky
_______________________________________________
yt-users mailing list -- yt-users@python.org
To unsubscribe send an email to yt-users-leave@python.org