Re: [yt-users] Problems reading GADGET 2 Binary datafile using yt

22 Feb 2017

Yes, this is extremely useful. Thanks a lot for taking the time to write to me in
detail about what went wrong. Just for info (in case it's helpful), I would
like to say that there's an open source module available called
pygadgetreader which was able to successfully read all the available data
in the Gadget 2 file I have shared.

Cheers,
Alankar

On Feb 23, 2017 12:42 AM, "Nathan Goldbaum"  wrote:
...
On Wed, Feb 22, 2017 at 12:17 PM, Alankar Dutta 
wrote:
...
Hello,
The entire snapshot file is huge (around 1 TB) so I am attaching only the
first part of it as of now via Google Drive. Hope this helps. I can share
it via other means if necessary.
Hmm, there seems to be an issue when yt tries to parse the header and
validate that everything is consistent.
You can see the affected code block in yt here:
https://bitbucket.org/yt_analysis/yt/src/abf5a8eff1b2d0cd776a41a33e5d3f3d25232ecc/yt/frontends/gadget/data_structures.py?at=yt&fileviewer=file-view-default#data_structures.py-340
This function is called (indirectly) by yt.load() to verify that a given
file is *really* a Gadget binary file. When I follow along with the
execution of that function for the file you sent me, I end up getting that
np0 is 20736113 while np1 is 41472226.0 (i.e. exactly twice np0). The first
integer, np0, is read in directly from the header of the binary file and
corresponds to the number of particles in the dataset as written out to the
header by Gadget. The second, np1, is the number of particles in the
dataset inferred by reading in the size of the position block. The fact
that inferring the particle count from the size of the position block ends
up with exactly twice the number of particles we expect probably
indicates that there's a single/double precision issue. In fact, it seems
that we expect each position entry to require 4 bytes (i.e. 32 bit, or
single precision), so I infer that your file contains double precision
positions (i.e. 8 bytes per position entry).
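The consistency check described above can be reproduced outside yt. What follows is a minimal sketch, not yt's actual implementation: it assumes the SnapFormat=1 layout from the Gadget user guide (a 256-byte header that starts with six unsigned 32-bit particle counts, with each block wrapped in 4-byte record markers holding the block size) and little-endian byte order. The function name infer_position_width is made up for this illustration.

```python
import struct

HEADER_BYTES = 256  # header size in the documented SnapFormat=1 layout


def infer_position_width(path, endian="<"):
    """Return (n_particles, bytes_per_coordinate) for a SnapFormat=1 file.

    Hypothetical helper: assumes the documented header layout and, by
    default, little-endian byte order.
    """
    with open(path, "rb") as f:
        # Each block is wrapped in 4-byte record markers holding its size.
        (head_size,) = struct.unpack(endian + "I", f.read(4))
        if head_size != HEADER_BYTES:
            raise ValueError("unexpected header record size: %d" % head_size)
        # The header starts with six per-type particle counts (this is np0).
        n_particles = sum(struct.unpack(endian + "6I", f.read(24)))
        # Skip the rest of the header and its closing record marker.
        f.seek(4 + HEADER_BYTES + 4)
        # The opening marker of the next block is the position block size.
        (pos_bytes,) = struct.unpack(endian + "I", f.read(4))
    if n_particles == 0:
        raise ValueError("header reports zero particles")
    # Three coordinates per particle; 4 bytes each is single precision,
    # 8 bytes each is double precision.
    width, remainder = divmod(pos_bytes, 3 * n_particles)
    if remainder:
        raise ValueError("position block size is not a multiple of 3 * N")
    return n_particles, width
```

If the diagnosis above is right, calling this on snapshot_068.0 should give a width of 8 rather than 4, which is the same factor of two seen between np1 and np0.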
So, all that to say, it looks like we would need to patch the Gadget
frontend to support your output type, which seems to have double precision
fields, which is a little bit different from the other Gadget binary
outputs we've seen in the past. These sorts of issues with Gadget are
unfortunately somewhat common due to the fragmentation in the Gadget
ecosystem, with many research groups maintaining mutually incompatible
versions of Gadget.
Note that if you look at page 32 in the Gadget user guide (
https://wwwmpa.mpa-garching.mpg.de/gadget/users-guide.pdf), this *is* a
bit different from the output format documented there, which specifies
single precision positions.
To add support for this output type we'd need to start with an example
smallish (<5 GB) dataset in this format that we can add as a public test
dataset on yt-project.org/data. Once that's available, we can patch the
Gadget frontend to support this output type.
Finally, you mentioned that your dataset is pretty large (~1 TB).
Unfortunately, yt will currently have trouble scaling to datasets that
large. Right now yt will require substantial amounts of RAM to index
datasets larger than about 1024^3 particles, since yt makes use of a global
octree for indexing and managing I/O chunking. With a dataset so large, the
octree index requires a substantial amount of RAM.
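For a rough sense of scale, the arithmetic below estimates only the memory needed to hold the raw particle coordinates for 1024^3 particles. It says nothing about the size of the octree index itself, which this thread does not quantify; the function name raw_position_gib is made up for this illustration.

```python
def raw_position_gib(n_particles, bytes_per_coordinate):
    """GiB needed to hold three coordinates per particle and nothing else."""
    return n_particles * 3 * bytes_per_coordinate / 2**30


n = 1024**3
print(raw_position_gib(n, 4))  # single precision: 12.0
print(raw_position_gib(n, 8))  # double precision: 24.0
```

So the coordinates alone would already take tens of GiB at this particle count, before any index is built on top of them.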
I am currently actively working on improving yt's scaling for large
particle datasets. This is a major development effort that will likely be
included in either the next major release of yt or the one after that.
Unfortunately I think you will have lots of issues trying to get yt's
current particle support to work well with as big of a dataset as you need
to work with and you will likely need to wait until the development effort
I'm working on is publicly available. I'd encourage you to sign up to the
yt-dev mailing list if you want to hear more about this effort. I will be
sharing a design document there describing the changes to yt that will be
necessary to improve scaling for particle data in the next week or two.
I hope that's helpful,
Nathan Goldbaum
...
Cheers,
Alankar
 snapshot_068.0
https://drive.google.com/file/d/0B6IIQdUdRX9UN3dWRkdLWWxjNjg/view?usp=drive_...

On Wed, Feb 22, 2017 at 10:37 PM, Nathan Goldbaum 
wrote:
...
We should have support for outputs in the SnapFormat=1 output format in
the latest release of yt. If you're not using the latest version of yt,
please try updating. If it's a multi-file dataset, you should load the 0th
file.
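As a small illustration of "load the 0th file", the sketch below resolves the ".0" subfile from a snapshot base name before handing it to yt.load(). The helper name zeroth_subfile is made up here, and the naming scheme (base name plus ".0", ".1", and so on) is taken from the file names quoted in this thread. The yt.load() call is left as a comment so the helper can be run without yt installed.

```python
from pathlib import Path


def zeroth_subfile(snapshot_base):
    """Return the '.0' subfile for a multi-file snapshot, if it exists.

    Hypothetical helper: falls back to the base name itself for a
    single-file snapshot.
    """
    base = Path(snapshot_base)
    candidate = base.with_name(base.name + ".0")
    if candidate.is_file():
        return candidate
    if base.is_file():
        return base
    raise FileNotFoundError("no snapshot file found for %s" % base)


# Usage, assuming yt is installed and the files exist:
# import yt
# ds = yt.load(str(zeroth_subfile("snapdir_068/snapshot_068")))
```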
If that is not working it would help to debug the issue if you can share
an output file that isn't loading. The easiest way to share an output is to
use the yt curldrop:
https://docs.hub.yt/services.html#curldrop
If you're not comfortable sharing the file publicly you can mail me
off-list with the link to the output file.
Hope that helps,
Nathan
On Wed, Feb 22, 2017 at 10:49 AM, Alankar Dutta 
...
wrote:
...
Hello YT-community,
These outputs were created with the SnapFormat parameter set to 1. This is a
requirement mentioned in yt's user guide.
Cheers,
Alankar
On Wed, Feb 22, 2017 at 10:02 PM, Alankar Dutta <
dutta.alankar@gmail.com> wrote:
...
Hello YT-community,
I have been trying to use yt to analyse the output from a GADGET
2 simulation stored as an unformatted Fortran binary. It consists of a
snapshot named snapshot_068, which is divided into 1024 subfiles named
snapshot_068.0, snapshot_068.1 and so on. Whenever I load this with
yt I get the following error message, and I have no idea how
to fix it. I have also tried reading only one subfile of this multi-part
snapshot, but with no success. I am relying on the community to help me
in this regard.
My code:
import yt
fname = 'snapdir_068/snapshot_068'
ds = yt.load(fname)
Error displayed:
yt : [ERROR    ] 2017-02-22 21:55:34,587 None of the arguments
provided to load() is a valid file
yt : [ERROR    ] 2017-02-22 21:55:34,587 Please check that you have
used a correct path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alankar/anaconda3/lib/python3.5/site-packages/yt/convenience.py",
line 76, in load
    raise YTOutputNotIdentified(args, kwargs)
yt.utilities.exceptions.YTOutputNotIdentified: Supplied
('snapshot_068',) {}, but could not load!
# Trying to read only one file of the multi-part snapshot
My code:
fname = 'snapdir_068/snapshot_068.0'
ds = yt.load(fname)
Error displayed:
yt : [ERROR    ] 2017-02-22 21:57:17,625 Couldn't figure out output
type for /media/alankar/Seagate Expansion Drive/mb2/snapshots/snapdir_068/snapshot_068.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alankar/anaconda3/lib/python3.5/site-packages/yt/convenience.py",
line 98, in load
    raise YTOutputNotIdentified(args, kwargs)
yt.utilities.exceptions.YTOutputNotIdentified: Supplied
('snapshot_068.0',) {}, but could not load!
Cheers,
Alankar
_______________________________________________
yt-users mailing list
yt-users@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org