dtype '|S0' not understood
Howdy,

It seems it's possible to get yourself a zero-length string using e.g.

    In [25]: dtype([('foo', str)])
    Out[25]: dtype([('foo', '|S0')])

However, dtype('|S0') results in a TypeError: data type not understood.

I understand the stupidity of creating a 0-length string field, but it's conceivable that it's accidental. For example, it could lead to a situation where you've created that field, are missing all the data you had meant to put in it, serialize with np.save, and upon np.load aren't able to get _any_ of your data back because the dtype descriptor is considered bogus (can you guess why I thought of this scenario?).

It seems that either dtype(str) should do something more sensible than a zero-length string, or it should be possible to create it with dtype('|S0'). Which should it be?

David
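A minimal sketch for checking whether a given NumPy installation shows the mismatch described above. It assumes Python 3, where str maps to a 'U' (unicode) dtype rather than the 'S' dtype shown in the session, and the helper names can_construct and unparseable_fields are hypothetical, not part of NumPy. Results will differ between NumPy versions, so no output is shown.

```python
import numpy as np


def can_construct(spec):
    """Return True if np.dtype(spec) succeeds, False if NumPy rejects it."""
    try:
        np.dtype(spec)
    except (TypeError, ValueError):
        return False
    return True


def unparseable_fields(dt):
    """List the fields of dt whose type string np.dtype() refuses to parse.

    For a non-structured dtype, the list holds the dtype's own type string
    if it cannot be parsed, and is empty otherwise.
    """
    dt = np.dtype(dt)
    if dt.names is None:
        return [] if can_construct(dt.str) else [dt.str]
    return [name for name in dt.names
            if not can_construct(dt.fields[name][0].str)]


# The structured dtype from the message, plus an ordinary float field.
record = np.dtype([("foo", str), ("bar", "f8")])
print(record)
print("fields that would not survive a descriptor round trip:",
      unparseable_fields(record))
```

An empty list means every field's type string can be turned back into a dtype, which is what reading a saved descriptor back relies on.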
On 23-Sep-09, at 7:55 PM, David Warde-Farley wrote:
Since there wasn't any response, I went ahead and fixed it by making str and unicode dtypes allow a size of 0 when constructed with protocol type codes. Either S0 and U0 should be constructable with typecodes or they shouldn't be allowed at all; I opted for the former since a) it was simple and b) I don't know what a sensible default for dtype(str) would be (length 1? length 10?).

Patch is at: http://projects.scipy.org/numpy/ticket/1239

Review away!

David
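For code that cannot rely on the patch being present, one possible user-side precaution is to give string fields an explicit width before creating or saving an array. The sketch below is only an illustration under stated assumptions: widen_empty_strings is a hypothetical helper, the default width of 1 is an arbitrary choice (the question of a sensible default is exactly what is left open above), and the rebuilt dtype does not preserve custom field offsets, titles, or alignment.

```python
import numpy as np


def widen_empty_strings(dt, min_width=1):
    """Return a copy of dt in which zero-width 'S'/'U' types get min_width.

    Structured dtypes are rebuilt field by field (recursing into nested
    structured fields); all other dtypes are returned unchanged.
    """
    dt = np.dtype(dt)
    if dt.names is None:
        if dt.kind in "SU" and dt.itemsize == 0:
            # Replace e.g. a zero-width 'U' with 'U1' (for min_width=1).
            return np.dtype(f"{dt.kind}{min_width}")
        return dt
    return np.dtype([
        (name, widen_empty_strings(dt.fields[name][0], min_width))
        for name in dt.names
    ])


# dtype(str) inside a structured dtype is the case from the thread.
risky = np.dtype([("foo", str), ("bar", "f8")])
safe = widen_empty_strings(risky)
print(risky, "->", safe)
```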
This seems to be an old problem, but I've recently been hit by it in a very random way (I'm using numpy 1.6.1). There is a ticket (1239), but the issue appears to be unscheduled. Can somebody tell me if this is fixed?

In particular, it makes for very unstable behavior when you take an element from a string array and pickle it across the wire. For example:

    In [1]: import numpy
    In [2]: a = numpy.array(['a', '', 'b'])
    In [3]: import cPickle
    In [4]: s = cPickle.dumps(a[1])
    In [5]: cPickle.loads(s)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
    ----> 1 cPickle.loads(s)

    TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))

Note that if you reference a[0] or a[2] instead, it works, so you end up in a situation where it sometimes works and sometimes doesn't. Checking for this case in code and working around it would really be a pain.

Thanks!

Will

On Thu, Sep 24, 2009 at 7:03 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
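As a sketch of one way to sidestep the pickling failure in Will's example, NumPy scalars can be converted to plain Python objects before they are pickled, so that no NumPy dtype ends up in the pickle stream at all. This assumes Python 3 and the standard pickle module in place of cPickle, and safe_dumps is a hypothetical helper name. The trade-off is that the receiver gets a built-in str rather than a NumPy string scalar.

```python
import pickle

import numpy as np


def safe_dumps(value):
    """Pickle value, first converting NumPy scalars to built-in Python types."""
    if isinstance(value, np.generic):
        # .item() returns the equivalent built-in object (str, int, float, ...).
        value = value.item()
    return pickle.dumps(value)


a = np.array(["a", "", "b"])

# Every element, including the empty string, takes the same code path.
restored = [pickle.loads(safe_dumps(x)) for x in a]
print(restored)
```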
participants (3)
-
David Warde-Farley
-
David Warde-Farley
-
Will Lee