
Sure, I am happy to be a guinea pig. I will need to install that clone on BW; so far I have been using the system-wide version of yt.
On 3/13/2018 6:06 PM, Britton Smith wrote:
Hi Nick,
I just issued a pull request that should cut out the largest scaling bottleneck in the yt FOF/HOP halo finders. The PR can be found here: https://github.com/yt-project/yt/pull/1724
As noted in the description, the one change to the results is that the final halo catalog will no longer be sorted by mass when it's written. Would it be possible for you to test out this modification?
Britton
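Since the catalog will no longer be written in mass order, any downstream script that relies on that ordering would have to sort explicitly. A minimal sketch, assuming the halo IDs and masses have already been read into plain Python lists; the values and the helper name `sort_halos_by_mass` are made up for illustration and are not part of yt:

```python
def sort_halos_by_mass(halo_ids, masses):
    """Return (ids, masses) reordered from most to least massive."""
    # Indices of the halos, ordered by descending mass.
    order = sorted(range(len(masses)), key=lambda i: masses[i], reverse=True)
    return [halo_ids[i] for i in order], [masses[i] for i in order]


# Hypothetical, unsorted catalog entries (illustrative masses only).
ids = [0, 1, 2, 3]
masses = [3.2e11, 8.9e12, 1.5e10, 4.4e12]

sorted_ids, sorted_masses = sort_halos_by_mass(ids, masses)
print(sorted_ids)
```

Sorting an index list rather than the masses themselves keeps the IDs and any other per-halo arrays aligned with the new order.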
On Tue, Mar 13, 2018 at 3:11 PM, Britton Smith <brittonsmith@gmail.com> wrote:
Hi Nick,
Sorry, I've not seen that error before.
It looks like the very latest Rockstar supports ART natively, but it may have the same issue. That code is here: https://bitbucket.org/pbehroozi/rockstar-galaxies
Britton
On Tue, Mar 13, 2018 at 2:21 PM, Nick Gnedin <ngnedin@gmail.com> wrote:
Britton,
That helps, and Rockstar starts and loads data, but now it gives an internal error:
P001 yt : [INFO ] 2018-03-13 16:03:11,005 Created 2048 chunks for ARTIO
P001 yt : [WARNING ] 2018-03-13 16:05:09,506 Total Particle Count: 1.342e+08
[Error] Couldn't open # dsname index :strict! (Err: Servname not supported for ai_socktype)
That error comes from the function default_addrinfo(...) in the Rockstar source code, file inet/socket.c.
I don't know how well you guys know Rockstar - let me know if you think this is a dead end.
n
On 03/13/2018 03:18 PM, Britton Smith wrote:
Hi Nick,
You might need to add the "num_readers" and "num_writers" keywords to this line:
rhf = RockstarHaloFinder(d, num_readers=8, num_writers=8)
The readers are the I/O nodes and the writers are the actual halo finding instances. 8 for each is a good place to start. When you run, you'll need to have (num_readers + num_writers + 1) MPI processes, where the extra one is for the server.
Hopefully, that takes care of it.
Britton
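The process-count rule above can be sketched as a small pre-flight calculation. This is only an illustration: the helper name `rockstar_mpi_size` is hypothetical and not part of yt, and the printed `mpirun` line is an assumed launcher invocation (the actual launcher depends on the machine), with `hfc.py` being the script name from the traceback in this thread.

```python
def rockstar_mpi_size(num_readers, num_writers):
    """Total MPI processes: readers + writers + one server process."""
    if num_readers < 1 or num_writers < 1:
        raise ValueError("need at least one reader and one writer")
    return num_readers + num_writers + 1


# The suggested starting point: 8 readers and 8 writers.
n = rockstar_mpi_size(8, 8)

# Assumed launcher syntax; substitute the site-specific equivalent.
print(f"mpirun -np {n} python hfc.py")
```

With 8 readers and 8 writers this gives 17 processes, so a job launched with fewer ranks would not match the requested reader and writer groups.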
On Tue, Mar 13, 2018 at 12:43 PM, Nick Gnedin <ngnedin@gmail.com> wrote:
Nathan,
Thank you. I followed the online instructions and now Rockstar starts, but gives me an error, irrespective of whether I call it directly or via a HaloCatalog object:
---------- code ---------
d = yt.load(root+"/rei20_a"+aexp+"/rei20_a"+aexp+".art")
rhf = RockstarHaloFinder(d)
rhf.run()
-------------------------
Traceback (most recent call last):
  File "hfc.py", line 15, in <module>
    rhf = RockstarHaloFinder(d)
  File "/scratch/midway2/gnedin/TMP/yt/yt/analysis_modules/halo_finding/rockstar/rockstar.py", line 230, in __init__
    self.pool, self.workgroup = self.runner.setup_pool()
  File "/scratch/midway2/gnedin/TMP/yt/yt/analysis_modules/halo_finding/rockstar/rockstar.py", line 105, in setup_pool
    (self.num_writers, "writers") ]
  File "/scratch/midway2/gnedin/TMP/yt/yt/utilities/parallel_tools/parallel_analysis_interface.py", line 400, in from_sizes
    pool.add_workgroup(size, name = name)
  File "/scratch/midway2/gnedin/TMP/yt/yt/utilities/parallel_tools/parallel_analysis_interface.py", line 368, in add_workgroup
    group = self.comm.comm.Get_group().Incl(ranks)
AttributeError: 'NoneType' object has no attribute 'Get_group'
On 03/13/2018 02:01 PM, Nathan Goldbaum wrote:
On Tue, Mar 13, 2018 at 1:52 PM, Nick Gnedin <ngnedin@gmail.com> wrote:
Britton,
I am trying to run Rockstar, but it does not seem to be packaged with yt by default, import fails:
Traceback (most recent call last):
  File "hfc.py", line 3, in <module>
    from yt.analysis_modules.halo_finding.rockstar.api import RockstarHaloFinder
  File "/home/gnedin/anaconda3/lib/python3.6/site-packages/yt/analysis_modules/halo_finding/rockstar/api.py", line 16, in <module>
    from .rockstar import RockstarHaloFinder
  File "/home/gnedin/anaconda3/lib/python3.6/site-packages/yt/analysis_modules/halo_finding/rockstar/rockstar.py", line 28, in <module>
    from . import rockstar_interface
ImportError: cannot import name 'rockstar_interface'
I think I am using the latest versions of python and yt:
yt module located at:
/home/gnedin/anaconda3/lib/python3.6/site-packages
The current version of yt is:
--- Version = 3.4.1 ---
It is not packaged with yt out of the box. The easiest way to get a copy of yt built with the rockstar bindings is to install yt with the install script. You'll need to modify the script so that INST_ROCKSTAR=1 once you've downloaded it.
Alternatively, you can manually build yt and rockstar following the instructions in the docs here:
http://yt-project.org/docs/dev/installing.html#installing-support-for-the-rockstar-halo-finder
The rockstar halo finder is licensed under GPLv3 so unfortunately we can't distribute it with the yt binaries on pypi or conda-forge without changing their license as well.
n
On 03/13/2018 12:44 PM, Britton Smith wrote:
Hi Nick,
Thanks for your report. Your timing data confirms my suspicion about which part of the code isn't scaling. The rejoining of the halo list after the halo finder is run makes heavy use of MPI broadcast calls. Reworking this shouldn't be too difficult, just a question of someone finding the time. If anyone is interested in trying to fix this, I can direct them to the places that need the attention.
Nick, in the meantime, you might try the Rockstar halo finder (either the one built into yt or the standalone), which scales quite well. The output from both Rockstar versions is loadable with yt.
Britton
On Tue, Mar 13, 2018 at 8:46 AM, Nick Gnedin <ngnedin@gmail.com> wrote:
This is just a notice to the developer.
I have run a HOP halo finder for a large ART simulation (1024^3 particles) on BlueWaters with the following code:
import yt
from yt.analysis_modules.halo_analysis.api import HaloCatalog

yt.enable_parallelism()

path = "/mnt/c/scratch/sciteam/ngnedin/PERM/B40/D"
aexps = [ "0.1280", "0.1203", "0.1115", "0.1002", "0.0907" ]

for aexp in aexps:
    d = yt.load(path+"/rei40_a"+aexp+"/rei40_a"+aexp+".art")
    hc = HaloCatalog(data_ds=d, finder_method='hop',
                     output_dir=path+"/a="+aexp+"/hop",
                     finder_kwargs={"dm_only":False,"ptype":"N-BODY"})
    hc.create()
Because of memory constraints, I have to run it on at least 4 MPI ranks, and I noticed that the yt implementation of HOP does not scale - a 4-rank job takes 14.5 hours and an 8-rank one takes 15.75 hours. Surely, halo finding for a billion particles should scale better than that.
Here is some timing info (I can provide a full log if you care):
4-rank run:
P000 yt : [INFO ] 2018-03-11 17:31:25,531 Parameters: ...
P000 yt : [INFO ] 2018-03-11 18:36:49,781 Initializing HOP [1h]
P002 yt : [INFO ] 2018-03-11 22:14:15,486 Parsing outputs [3.5h]
P000 yt : [INFO ] 2018-03-12 08:06:38,231 Saving halo ... [10h]

8-rank run:
P000 yt : [INFO ] 2018-03-10 21:03:27,226 Parameters: ...
P000 yt : [INFO ] 2018-03-10 21:43:52,543 Initializing HOP [0.75h]
P005 yt : [INFO ] 2018-03-10 23:52:10,389 Parsing outputs [2h]
P000 yt : [INFO ] 2018-03-11 12:43:46,645 Saving halo ... [12.5h *]
* - does not scale at all.
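The bracketed durations above can be reproduced from the log timestamps. A minimal sketch, assuming the timestamps are copied by hand from the two logs with the milliseconds dropped; the helper name `phase_hours` and the interval labels are illustrative only:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def phase_hours(stamps):
    """Hours elapsed between each pair of consecutive log timestamps."""
    times = [datetime.strptime(s, FMT) for s in stamps]
    return [(b - a).total_seconds() / 3600.0 for a, b in zip(times, times[1:])]


# Timestamps of: Parameters, Initializing HOP, Parsing outputs, Saving halo.
run4 = ["2018-03-11 17:31:25", "2018-03-11 18:36:49",
        "2018-03-11 22:14:15", "2018-03-12 08:06:38"]
run8 = ["2018-03-10 21:03:27", "2018-03-10 21:43:52",
        "2018-03-10 23:52:10", "2018-03-11 12:43:46"]

h4 = phase_hours(run4)
h8 = phase_hours(run8)

# Each interval is labelled by the log message that ends it.
labels = ("up to 'Initializing HOP'", "up to 'Parsing outputs'", "up to 'Saving halo'")
for label, a, b in zip(labels, h4, h8):
    # Speedup > 1 means the 8-rank run was faster for that interval.
    print(f"{label}: 4 ranks {a:.2f} h, 8 ranks {b:.2f} h, speedup {a / b:.2f}")
```

Ideal scaling from 4 to 8 ranks would give a speedup of 2 for every interval. The first two intervals show some gain, while the last one has a speedup below 1, which is the part marked with the asterisk.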
n
_______________________________________________
yt-users mailing list -- yt-users@python.org
To unsubscribe send an email to yt-users-leave@python.org

Great! Let me know if you need any help with installation or pulling my changes.
Britton
participants (2)
- Britton Smith
- Nick Gnedin