[Numpy-discussion] Question about dtype

Valentin Haenel valentin at haenel.co
Fri Dec 26 20:21:40 EST 2014


* Nathaniel Smith <njs at pobox.com> [2014-12-13]:
[snip]
> Ah, so your question is about how to serialize dtypes.
> 
> The simplest approach would be to use pickle and shove the resulting string
> into your json. However, this is very dangerous if you need to process
> untrusted files, because if I can convince you to unpickle an arbitrary
> string, then I can run arbitrary code on your computer.
> 
> I believe the .npy file format has a safe method for (un)serializing dtypes.
> I'd look up what it does.

Just to follow this up:

NPY actually does some magic to differentiate between simple and complex
dtypes (I had already discovered this and am doing it too):

https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L210
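
For illustration, a minimal sketch of that distinction as I understand it:
a dtype with named fields is described by its ``descr`` list, anything else
by its ``str``. The function name ``descr_for`` is made up, and the two
stand-in objects only mimic the ``names``, ``str`` and ``descr`` attributes
of a real dtype so the snippet runs without NumPy installed.

```python
from types import SimpleNamespace


def descr_for(dtype):
    """Return a description of ``dtype`` that survives repr/eval.

    Hypothetical helper: it relies only on the ``names``, ``descr`` and
    ``str`` attributes that NumPy dtypes expose.
    """
    if dtype.names is not None:
        # Structured ("complex") dtype: a list of (name, format) tuples.
        return dtype.descr
    # Simple dtype: a plain string such as '<f8'.
    return dtype.str


# Stand-ins carrying the attribute values NumPy reports for these dtypes
# (assumption: a little-endian float64 and a two-field record).
simple = SimpleNamespace(names=None, str="<f8", descr=[("", "<f8")])
structured = SimpleNamespace(
    names=("a", "b"), str="|V12", descr=[("a", "<i4"), ("b", "<f8")]
)

simple_descr = descr_for(simple)
structured_descr = descr_for(structured)
```

A real dtype object exposes the same three attributes, so it should be
possible to pass one to ``descr_for`` unchanged.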

And then it does a ``repr`` on the result:

https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L290

On loading it does a ``(safe_)eval`` on the whole header dict:

https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L479

I can do that too, and it is what was suggested later on in this thread,
but simple dtypes cause a SyntaxError, since a bare string such as <f8 is
not a valid Python literal. So what I'll do is try to safe_eval the
string, catch the SyntaxError and just use the plain string in that case.
That should be easier than trying to reassemble the correct thing from the
deserialized JSON.
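
Roughly, the plan looks like the sketch below. It is not the final code:
``encode_descr`` and ``decode_descr`` are hypothetical names, and
``ast.literal_eval`` from the standard library stands in for NumPy's
``safe_eval``. Catching ``ValueError`` as well is an extra precaution I am
assuming is harmless, since a bare name such as float64 parses fine but is
rejected as a non-literal.

```python
import ast
import json


def encode_descr(descr):
    """Turn a dtype description into a string that fits in a JSON value."""
    # Simple dtypes are already strings such as '<f8'. Structured ones are
    # lists of tuples, which JSON would turn into nested lists, so store
    # their repr instead.
    return descr if isinstance(descr, str) else repr(descr)


def decode_descr(text):
    """Recover the description written by encode_descr."""
    try:
        # Works for the repr of a structured description.
        return ast.literal_eval(text)
    except (SyntaxError, ValueError):
        # A simple dtype string is not a Python literal: use it as-is.
        return text


simple = "<f8"
structured = [("a", "<i4"), ("b", "<f8")]

# Round trip through JSON, as the metadata would be stored on disk.
payload = json.dumps(
    {"simple": encode_descr(simple), "structured": encode_descr(structured)}
)
restored = {key: decode_descr(value) for key, value in json.loads(payload).items()}
```

Both values in ``restored`` can then be handed to ``numpy.dtype`` directly,
because the structured one comes back as a list of tuples and not as the
nested lists that plain JSON would produce.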

best wishes and thanks for the advice!

V-


