loadtxt() behavior on single-line files - NumPy-Discussion

24 Jun 2010

Hi,

I was having the hardest time trying to figure out an intermittent bug in
one of my programs.  Essentially, in some situations, it was throwing an
error saying that the array object was not an array.  It took me a while,
but then I figured out that my program was assuming that the object returned
from a loadtxt() call was always a structured array (I was using dtypes).
However, if the data file being loaded has only one data record, then all
you get back is a single structured record (a 0-D array) instead of a 1-D
array.

import numpy as np
from StringIO import StringIO

strData = StringIO("89.23 47.2\n13.2 42.2")
a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
print "Length Two"
print a
print a.shape
print len(a)

strData = StringIO("53.2 49.2")
a = np.loadtxt(strData, dtype=[('x', float), ('y', float)])
print "\n\nLength One"
print a
print a.shape
try:
    print len(a)
except TypeError as err:
    print "ERROR:", err

Which gets me this output:

Length Two
[(89.230000000000004, 47.200000000000003)
 (13.199999999999999, 42.200000000000003)]
(2,)
2

Length One
(53.200000000000003, 49.200000000000003)
()
ERROR: len() of unsized object
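
One way to make calling code tolerate the single-record case is to normalize the result right after loading. The following is only a minimal sketch: `load_records` is a hypothetical helper name, not part of NumPy, and the code is written for current Python (hence `io.StringIO` and no print statements). It relies on `np.atleast_1d`, which promotes a 0-D array to shape `(1,)` and leaves 1-D input alone.

```python
from io import StringIO

import numpy as np

# Same structured dtype as in the example above.
RECORD_DTYPE = [('x', float), ('y', float)]


def load_records(text):
    """Hypothetical helper: always return a 1-D structured array."""
    raw = np.loadtxt(StringIO(text), dtype=RECORD_DTYPE)
    # A single record comes back 0-D; atleast_1d promotes it to shape (1,).
    return np.atleast_1d(raw)


one = load_records("53.2 49.2")
two = load_records("89.23 47.2\n13.2 42.2")

# len() now works for both the one-record and the two-record case.
lengths = (len(one), len(two))
```

With this wrapper, code that iterates over records or calls `len()` no longer needs a special branch for one-line files.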

Note that this isn't restricted to structured arrays.  For regular ndarrays,
loadtxt() appears to mimic the behavior of np.squeeze():

>>> a = np.ones((1, 1, 1))
>>> np.squeeze(a)[0]
IndexError: 0-d arrays can't be indexed

>>> strData = StringIO("53.2")
>>> a = np.loadtxt(strData)
>>> a[0]
IndexError: 0-d arrays can't be indexed

So, if you have multiple lines with multiple columns, you get a 2-D array,
as expected.
If you have a single line of data with multiple columns, you get a 1-D
array.
If you have a single column with many lines, you also get a 1-D array (which
is probably expected, I guess).
If you have a single column with a single line, you get a scalar (actually,
a 0-D array).
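
The four cases above can be checked in one go. This is a small sketch for current Python and NumPy; the sample data and the `cases` / `shapes` names are made up for illustration, and loadtxt() is called with its default arguments.

```python
from io import StringIO

import numpy as np

# Illustrative inputs, one per case described above.
cases = {
    "many lines, many columns": "1 2\n3 4",
    "one line, many columns": "1 2",
    "many lines, one column": "1\n2",
    "one line, one column": "53.2",
}

# Record the shape loadtxt() returns for each input.
shapes = {label: np.loadtxt(StringIO(text)).shape
          for label, text in cases.items()}

for label, shape in shapes.items():
    print(label, shape)
```

Note that the second and third cases are indistinguishable by shape alone: both come back as 1-D arrays of length two.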

Is this a bug or a feature?  I can see the advantages of having loadtxt()
return the lowest number of dimensions that can hold the given data, but it
leaves the code vulnerable to certain edge cases.  Maybe there is a
different way I should be doing this, but I feel that this behavior at the
very least should be included in the loadtxt documentation.
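
For plain (non-structured) arrays, later NumPy releases than the one discussed here give loadtxt() an `ndmin` argument that sets a minimum number of dimensions for the result. The sketch below assumes such a release and current Python; the variable names are illustrative only.

```python
from io import StringIO

import numpy as np

# ndmin=2 asks loadtxt() to keep the result two-dimensional
# (rows x columns) even when the file has one line or one column.
single_value = np.loadtxt(StringIO("53.2"), ndmin=2)
single_row = np.loadtxt(StringIO("1 2"), ndmin=2)
single_column = np.loadtxt(StringIO("1\n2"), ndmin=2)

# Indexing by row now works uniformly in all three cases.
first_rows = [arr[0] for arr in (single_value, single_row, single_column)]
```

On a release without `ndmin`, wrapping the result in `np.atleast_2d` gives a 2-D array as well, but it cannot tell a single row from a single column, so the orientation has to be fixed by the caller.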

Ben Root
