dtype 'S0' not understood
Howdy,

It seems it's possible, using e.g.

In [25]: dtype([('foo', str)])
Out[25]: dtype([('foo', 'S0')])

to get yourself a zero-length string. However, dtype('S0') results in a TypeError: data type not understood.

I understand the stupidity of creating a 0-length string field, but it's conceivable that it's accidental. For example, it could lead to a situation where you've created that field, are missing all the data you had meant to put in it, serialize with np.save, and upon np.load aren't able to get _any_ of your data back because the dtype descriptor is considered bogus (can you guess why I thought of this scenario?).

It seems that either dtype(str) should do something more sensible than a zero-length string, or it should be possible to create it with dtype('S0'). Which should it be?

David
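A minimal sketch for checking how a particular NumPy build treats these type codes, and for the np.save / np.load round trip described above. It assumes Python 3 and a reasonably current NumPy, where the results may differ from the 2009 behaviour reported in this message. The helper names `can_construct` and `roundtrip`, the field names, and the explicit width `S8` are illustrative assumptions, not anything from the thread.

```python
import io

import numpy as np


def can_construct(code):
    """Return True if np.dtype accepts `code`, False if it raises TypeError."""
    try:
        np.dtype(code)
        return True
    except TypeError:
        return False


def roundtrip(arr):
    """Write `arr` with np.save to an in-memory buffer and read it back."""
    buf = io.BytesIO()
    np.save(buf, arr)
    buf.seek(0)
    return np.load(buf)


# 'S1' is always valid; whether 'S0' and 'U0' are accepted depends on
# the NumPy version, so the result is printed rather than assumed.
for code in ("S1", "S0", "U0"):
    print(code, can_construct(code))

# Assumption: giving the string field an explicit width (here 8 bytes)
# avoids the accidental zero-length field altogether.
rec = np.zeros(2, dtype=[("foo", "S8"), ("x", "i4")])
rec["x"] = [1, 2]
back = roundtrip(rec)
print(back.dtype, back["x"])
```

Declaring the width up front is only one way to stay clear of the problem; it does not answer the question of what dtype(str) itself should do.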
On 23-Sep-09, at 7:55 PM, David Warde-Farley wrote:
> It seems that either dtype(str) should do something more sensible than zero-length string, or it should be possible to create it with dtype('S0'). Which should it be?
Since there wasn't any response, I went ahead and fixed it by making str and unicode dtypes allow a size of 0 when constructed with protocol type codes. Either S0 and U0 should be constructable with typecodes or they shouldn't be allowed at all; I opted for the former since a) it was simple and b) I don't know what a sensible default for dtype(str) would be (length 1? length 10?).

Patch is at:

http://projects.scipy.org/numpy/ticket/1239

Review away!

David
This seems to be an old problem, but I've recently been hit by it in a very random way (I'm using numpy 1.6.1). There seems to be a ticket (1239), but it seems the issue is unscheduled. Can somebody tell me if this is fixed?

In particular, it makes for very unstable behavior when you try to reference something from a string array and pickle it across the wire. For example:

In [1]: import numpy
In [2]: a = numpy.array(['a', '', 'b'])
In [3]: import cPickle
In [4]: s = cPickle.dumps(a[1])
In [5]: cPickle.loads(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
----> 1 cPickle.loads(s)
TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))

Note that if you reference a[0] or a[2], it works, so you're in the case where sometimes it works and sometimes it doesn't. Checking for this case in the code and working around it would really be a pain.

Thanks!
Will

On Thu, Sep 24, 2009 at 7:03 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
> On 23-Sep-09, at 7:55 PM, David Warde-Farley wrote:
>> It seems that either dtype(str) should do something more sensible than zero-length string, or it should be possible to create it with dtype('S0'). Which should it be?
> Since there wasn't any response, I went ahead and fixed it by making str and unicode dtypes allow a size of 0 when constructed with protocol type codes. Either S0 and U0 should be constructable with typecodes or they shouldn't be allowed at all; I opted for the former since a) it was simple and b) I don't know what a sensible default for dtype(str) would be (length 1? length 10?).
> Patch is at:
> http://projects.scipy.org/numpy/ticket/1239
> Review away!
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
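One possible workaround for the pickling failure in the traceback above, sketched under some assumptions: it uses Python 3's `pickle` in place of `cPickle`, byte strings in place of Python 2 `str`, and a hypothetical helper named `dumps_native`. The idea is to convert NumPy scalars to plain Python objects before pickling, so that unpickling never has to rebuild a dtype such as 'S0'. Whether the unmodified example still fails depends on the NumPy version in use.

```python
import pickle

import numpy as np


def dumps_native(value):
    """Pickle `value`, converting NumPy scalars to plain Python objects first."""
    if isinstance(value, np.generic):
        # np.bytes_ -> bytes, np.str_ -> str, np.int64 -> int, and so on.
        value = value.item()
    return pickle.dumps(value)


# Same shape as the example above: the middle element is the empty string.
a = np.array([b"a", b"", b"b"])
restored = [pickle.loads(dumps_native(x)) for x in a]
print(restored)
```

The trade-off is that the receiver gets a plain `bytes` object rather than a NumPy scalar, which is usually acceptable for individual elements sent across the wire.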
On Wed, Oct 3, 2012 at 1:58 PM, Will Lee <lee.will@gmail.com> wrote:
> This seems to be an old problem, but I've recently been hit by it in a very random way (I'm using numpy 1.6.1). There seems to be a ticket (1239), but it seems the issue is unscheduled. Can somebody tell me if this is fixed?
>
> In particular, it makes for very unstable behavior when you try to reference something from a string array and pickle it across the wire. For example:
>
> In [1]: import numpy
> In [2]: a = numpy.array(['a', '', 'b'])
> In [3]: import cPickle
> In [4]: s = cPickle.dumps(a[1])
> In [5]: cPickle.loads(s)
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> /auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
> ----> 1 cPickle.loads(s)
> TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))
>
> Note that if you reference a[0] or a[2], it works, so you're in the case where sometimes it works and sometimes it doesn't. Checking for this case in the code and working around it would really be a pain.
Hi Will,

Yes, this has been waiting on the bug tracker for a long while. I'll resubmit my patch as a pull request to see if we can't get this fixed up soon...

David