[Numpy-discussion] String manipulation

Christopher Barker Chris.Barker at noaa.gov
Tue Jul 21 16:37:37 EDT 2009


David Goldsmith wrote:
> Hi, Chris.  Look at this, _I'm_ answering one of _your_ questions
> (correctly, I hope):

maybe ;-)

> --- On Tue, 7/21/09, Christopher Barker <Chris.Barker at noaa.gov>
> wrote:
>> I don't see why:
>> 
>> np.array('a string', dtype='S1')
>> 
>> results in a length (1,) array, for instance.

> Well, as for why "[it's doing] its best to convert that scalar to a 
> length one string," that's because you used dtype='S1' instead of 
> dtype='S8':
> 
>>>> np.array('a string', dtype='S1')
> array('a', dtype='|S1')
>>>> np.array('a string', dtype='S8')
> array('a string', dtype='|S8')

sure, but what I wanted was an array of characters, that's why I did
'S1'. Actually, I'd like to go one step further, so that:

In [27]: np.array('a string', dtype='S2')
Out[27]:
array('a ',
       dtype='|S2')

would yield:

array(['a ', 'st', 'ri', 'ng'],
       dtype='|S2')


instead.
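
For reference, a minimal sketch of one way to get that chunked result today. It assumes Python 3, where the text must be a bytes object, and it uses np.frombuffer rather than np.array, so it is a workaround and not the behavior asked for above. The names text, pairs, chars and editable are illustrative only.

```python
import numpy as np

text = b'a string'  # 8 bytes, so it divides evenly into 2-byte chunks

# Reinterpret the raw bytes as fixed-width string items.
pairs = np.frombuffer(text, dtype='S2')   # four 2-byte items
chars = np.frombuffer(text, dtype='S1')   # eight 1-byte items

# frombuffer returns a read-only view of the bytes object;
# copy it if a mutable array is wanted.
editable = chars.copy()
editable[0] = b'A'

print(pairs)
print(editable)
```

The buffer length has to be a multiple of the item size, otherwise np.frombuffer raises a ValueError.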

> but as for shape, I can't reproduce your result at all:
> 
>>>> np.array('a string', dtype='S1').shape
> ()

that's 'cause I lied -- it's a numpy scalar (is that right?), not a
length-1 array, just like if you do:

np.array(5, dtype=np.int)
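
On the "is that right?" question, a small sketch of how the two cases can be told apart. As far as I can tell, np.array applied to a bare scalar gives a zero-dimensional ndarray, and indexing it with an empty tuple gives the corresponding numpy scalar. The sketch uses the builtin int as the dtype; the variable names are illustrative.

```python
import numpy as np

a = np.array(5, dtype=int)

# A 0-d array is still an ndarray; it just has an empty shape.
print(type(a), a.shape, a.ndim)

# Indexing with an empty tuple extracts the numpy scalar it holds.
s = a[()]
print(type(s))

# The string case behaves the same way: shape (), not (1,).
c = np.array('a string', dtype='S1')
print(c.shape, c[()])
```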


> I see your point (global consistency) but personally, IMO, if you
> want that kind of string behavior, work in the Python namespace: 

sure, but I'd like numpy to see wider use beyond number crunching -- a 
homogeneous mutable n-d array is useful for a lot of things, like this.

But your post gave me an idea, one could do:

In [35]: line
Out[35]: '-1.000000E+00-1.000000E+00-1.000000E+00-1.000000E+00 1.250000E+00 1.250000E+00'

In [36]: a = np.array(line)

In [37]: a
Out[37]:
array('-1.000000E+00-1.000000E+00-1.000000E+00-1.000000E+00 1.250000E+00 1.250000E+00',
       dtype='|S78')

but now when I try to split it up:

In [38]: a = a.view(dtype='S13')

I get:

ValueError: new type not compatible with array.

same with 'S1':

In [39]: a = a.view(dtype='S1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)


Shouldn't that work?
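
A possible workaround, sketched on the assumption that the view fails because np.array(line) is a 0-d array with no axis to split along. Wrapping the string in a one-element list gives a 1-d array, and the view can then resize that axis. This assumes Python 3 with a bytes object; fields, chars and values are illustrative names.

```python
import numpy as np

line = (b'-1.000000E+00-1.000000E+00-1.000000E+00-1.000000E+00'
        b' 1.250000E+00 1.250000E+00')

# One-element 1-d array: shape (1,), dtype S78.
a = np.array([line])

# 78 bytes reinterpreted as six 13-byte fields, or as 78 single characters.
fields = a.view('S13')
chars = a.view('S1')

# Convert each fixed-width field to a Python float.
values = [float(f.decode('ascii')) for f in fields]

print(fields.shape, chars.shape)
print(values)
```

The views share memory with a, so no bytes are copied when the line is split.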

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


