Yes this is extremely useful. Thanks a lot for taking time to write me in
detail about what went wrong. Just for info (in case it's helpful) I would
like to say that there's an open source module available called
pygadgetreader which was able to successfully read all the available data
in the Gadget 2 file I have shared.
Cheers,
Alankar
On Feb 23, 2017 12:42 AM, "Nathan Goldbaum"
On Wed, Feb 22, 2017 at 12:17 PM, Alankar Dutta wrote:
Hello,
The entire snapshot file is huge (around 1 TB) so I am attaching only the first part of it as of now via Google Drive. Hope this helps. I can share it via other means if necessary.
Hmm, there seems to be an issue when yt tries to parse the header and validate that everything is consistent.
You can see the affected code block in yt here:
https://bitbucket.org/yt_analysis/yt/src/abf5a8eff1b2d0cd776a41a33e5d3f3d25232ecc/yt/frontends/gadget/data_structures.py?at=yt&fileviewer=file-view-default#data_structures.py-340
This function is called (indirectly) by yt.load() to verify that a given file is *really* a Gadget binary file. When I follow along with the execution of that function for the file you sent me, I find that np0 is 20736113 while np1 is 41472226.0 (i.e. exactly twice np0). The first integer, np0, is read directly from the header of the binary file and corresponds to the number of particles in the dataset as written out to the header by Gadget. The second, np1, is the number of particles in the dataset inferred from the size of the position block. The fact that the position block implies exactly twice as many particles as the header reports probably indicates a single/double precision issue. In fact, we expect each position entry to require 4 bytes (i.e. 32 bit, or single precision), so I infer that your file contains double precision positions (i.e. 8 bytes per position entry).
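For illustration, the consistency check described above can be sketched roughly as follows. This is not the code in yt's Gadget frontend; the function name `infer_position_precision` and the little-endian default are assumptions made for the example. It relies only on the SnapFormat=1 layout: a 256-byte header whose first six unsigned integers are the per-type particle counts, followed by a position block wrapped in 4-byte record markers that give its size in bytes.

```python
import io
import struct

HEADER_SIZE = 256  # payload size of a Gadget-2 SnapFormat=1 header block


def infer_position_precision(stream, endian="<"):
    """Return (np0, bytes_per_coordinate) for a SnapFormat=1 binary stream.

    Hypothetical helper for illustration; `endian` is a struct byte-order
    prefix and is assumed to be little-endian by default.
    """
    # Header block: opening record marker, 256 bytes of payload, closing marker.
    (marker,) = struct.unpack(endian + "I", stream.read(4))
    if marker != HEADER_SIZE:
        raise ValueError("unexpected header record marker: %d" % marker)
    header = stream.read(HEADER_SIZE)
    stream.read(4)  # skip the closing record marker

    # The first six unsigned ints are the particle counts per type in this file.
    np0 = sum(struct.unpack(endian + "6I", header[:24]))
    if np0 == 0:
        raise ValueError("header reports zero particles")

    # The opening marker of the next block is the position block size in bytes.
    (pos_bytes,) = struct.unpack(endian + "I", stream.read(4))

    # Each particle stores 3 coordinates, so the width per coordinate follows.
    bytes_per_coord, remainder = divmod(pos_bytes, 3 * np0)
    if remainder:
        raise ValueError("position block size is not a multiple of 3 * np0")
    return np0, bytes_per_coord


# Usage on a real file (path taken from this thread):
# with open("snapdir_068/snapshot_068.0", "rb") as f:
#     print(infer_position_precision(f))
```

A result of 4 bytes per coordinate would correspond to single precision and 8 to double precision. With 8-byte coordinates the position block holds 24 bytes per particle, which is why dividing its size by 12 gives an np1 of exactly 2 * np0.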
So, all that to say, it looks like we would need to patch the Gadget frontend to support your output type, which seems to have double precision fields, which is a little bit different from the other Gadget binary outputs we've seen in the past. These sorts of issues with Gadget are unfortunately somewhat common due to the fragmentation in the Gadget ecosystem, with many research groups maintaining mutually incompatible versions of Gadget.
Note that if you look at page 32 in the Gadget user guide (https://wwwmpa.mpa-garching.mpg.de/gadget/users-guide.pdf), this *is* a bit different from the output format documented there, which specifies single precision positions.
To add support for this output type we'd need to start with an example smallish (<5 GB) dataset in this format that we can add as a public test dataset on yt-project.org/data. Once that's available, we can patch the Gadget frontend to support this output type.
Finally, you mentioned that your dataset is pretty large (~1 TB). Unfortunately, yt will currently have trouble scaling to datasets that large. Right now yt will require substantial amounts of RAM to index datasets larger than about 1024^3 particles, since yt makes use of a global octree for indexing and managing I/O chunking. With a dataset so large, the octree index requires a substantial amount of RAM.
I am currently actively working on improving yt's scaling for large particle datasets. This is a major development effort that will likely be included in either the next major release of yt or the one after that. Unfortunately I think you will have lots of issues trying to get yt's current particle support to work well with as big of a dataset as you need to work with and you will likely need to wait until the development effort I'm working on is publicly available. I'd encourage you to sign up to the yt-dev mailing list if you want to hear more about this effort. I will be sharing a design document there describing the changes to yt that will be necessary to improve scaling for particle data in the next week or two.
I hope that's helpful,
Nathan Goldbaum
Cheers,
Alankar
snapshot_068.0
https://drive.google.com/file/d/0B6IIQdUdRX9UN3dWRkdLWWxjNjg/view?usp=drive_...
On Wed, Feb 22, 2017 at 10:37 PM, Nathan Goldbaum wrote:
We should have support for outputs in the SnapFormat=1 output format in the latest release of yt. If you're not using the latest version of yt, please try updating. If it's a multi-file dataset, you should load the 0th file.
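As a rough sketch of what "load the 0th file" means in practice, a small helper can pick the `.0` subfile when one exists next to the base name. The helper name `zeroth_subfile` is hypothetical and not part of yt; its result would then be passed to yt.load().

```python
from pathlib import Path


def zeroth_subfile(base):
    """Return base + '.0' if that subfile exists, otherwise base unchanged.

    Hypothetical helper for illustration; not part of yt.
    """
    base = Path(base)
    first = base.with_name(base.name + ".0")
    return first if first.is_file() else base


# Usage (requires yt to be installed; path taken from this thread):
# import yt
# ds = yt.load(str(zeroth_subfile("snapdir_068/snapshot_068")))
```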
If that is not working, it would help to debug the issue if you can share an output file that isn't loading. The easiest way to share an output is to use the yt curldrop:
https://docs.hub.yt/services.html#curldrop
If you're not comfortable sharing the file publicly you can mail me off-list with the link to the output file.
Hope that helps,
Nathan
On Wed, Feb 22, 2017 at 10:49 AM, Alankar Dutta wrote:
Hello YT-community,
These outputs were created with the SnapFormat parameter set to 1. This is a requirement mentioned in the yt user guide.
Cheers, Alankar
On Wed, Feb 22, 2017 at 10:02 PM, Alankar Dutta <dutta.alankar@gmail.com> wrote:
Hello YT-community,
I have been trying to use yt to analyse the output of a GADGET 2 simulation stored as an unformatted Fortran binary. The snapshot, snapshot_068, is divided into 1024 subfiles named snapshot_068.0, snapshot_068.1 and so on. Whenever I load this with yt I get the following error message, and I have no idea how to fix it. I have also tried reading only one subfile of this multi-part snapshot, but with no success. I am relying on the community to help me in this regard.
My code:
import yt
fname = 'snapdir_068/snapshot_068'
ds = yt.load(fname)
Error displayed:
yt : [ERROR ] 2017-02-22 21:55:34,587 None of the arguments provided to load() is a valid file
yt : [ERROR ] 2017-02-22 21:55:34,587 Please check that you have used a correct path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alankar/anaconda3/lib/python3.5/site-packages/yt/convenience.py", line 76, in load
    raise YTOutputNotIdentified(args, kwargs)
yt.utilities.exceptions.YTOutputNotIdentified: Supplied ('snapshot_068',) {}, but could not load!
# Trying to read only one file of the multi-part snapshot
My code:
import yt
fname = 'snapdir_068/snapshot_068.0'
ds = yt.load(fname)
Error displayed:
yt : [ERROR ] 2017-02-22 21:57:17,625 Couldn't figure out output type for /media/alankar/Seagate Expansion Drive/mb2/snapshots/snapdir_068/snapshot_068.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alankar/anaconda3/lib/python3.5/site-packages/yt/convenience.py", line 98, in load
    raise YTOutputNotIdentified(args, kwargs)
yt.utilities.exceptions.YTOutputNotIdentified: Supplied ('snapshot_068.0',) {}, but could not load!
Cheers, Alankar
_______________________________________________ yt-users mailing list yt-users@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org