[Numpy-discussion] A bug in loadtxt and how to convert a string array (hex data) to decimal?

Fri Sep 19 04:52:01 EDT 2008

2008/9/18 Ryan May <rmay31 at gmail.com>:
> It's because of how numpy handles string arrays (which I admit I don't
> understand very well.)  Basically, it's converting the numbers properly,
> but truncating them to 3 characters.  Try this, which just forces it to
> expand to strings 4 characters wide:
> test = loadtxt('test.txt', comments='"', dtype='|S4',
>                converters={0: lambda s: int(s, 16)})

Here's what happens in the background:

>>> data = [(1023, '3fE'), (1007, '3e8'), (991, '3d9'), (975, '3c7')]
>>> np.array(data, np.dtype('string'))
array([['102', '3fE'],
       ['100', '3e8'],
       ['991', '3d9'],
       ['975', '3c7']],
      dtype='|S3')

Why?  Because

>>> np.dtype('string')
dtype('|S0')

So, with no width given, it grabs the width from the first string it sees
(3 characters here), and the converted integers get truncated to fit.
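
To see the truncation in isolation, here is a minimal sketch that fixes the width by hand instead of letting NumPy pick it. The sample rows are made up for illustration (they are not taken from the original test.txt), and on Python 3 the 'S' dtypes hold byte strings, so values print with a b'' prefix.

```python
import numpy as np

# Hypothetical sample rows: a 4-character number next to a 3-character hex string.
rows = [("1023", "3fE"), ("1007", "3e8")]

# A 3-character-wide string dtype silently drops the fourth character.
narrow = np.array(rows, dtype="S3")

# A 4-character-wide dtype keeps the full value.
wide = np.array(rows, dtype="S4")

print(narrow[0, 0])  # b'102'
print(wide[0, 0])    # b'1023'
```

This is the same effect Ryan's dtype='|S4' suggestion relies on: the width has to be at least as large as the longest value after conversion.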

A clean workaround, then, is to have the converter return a string, so
the width is taken from the converted value:

test = np.loadtxt('/tmp/data.txt', comments='"', dtype='string',
                  converters={0: lambda s: str(int(s, 16))})
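
The same idea can be sketched without loadtxt, which may help when checking what the converter does on its own. The file contents below are an assumption (hex values chosen so that the first column decodes to the numbers shown earlier), and hex_to_decimal_str is a hypothetical helper name. The rows are parsed by hand and the array is built without an explicit dtype, so NumPy sizes the strings from the data.

```python
import numpy as np

def hex_to_decimal_str(s):
    """Parse a hex field and return its decimal value as a string."""
    return str(int(s, 16))

# Assumed file contents: two whitespace-separated hex columns per line.
lines = ["3ff 3fE", "3ef 3e8", "3df 3d9", "3cf 3c7"]

rows = []
for line in lines:
    first, second = line.split()
    # Convert only column 0, as in the converters={0: ...} call above.
    rows.append((hex_to_decimal_str(first), second))

# No dtype given: the string width is inferred from the longest element.
table = np.array(rows)

print(table[0, 0])        # 1023
print(table.dtype.kind)   # U
```

Because the converter already hands back strings, nothing is truncated: the widest entry ('1023', 4 characters) sets the item width.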

Cheers
Stéfan