On Mon, Feb 28, 2011 at 11:45 AM, Bruce Southey <bsouthey@gmail.com> wrote:

On 02/28/2011 09:47 AM, Benjamin Root wrote:

On Mon, Feb 28, 2011 at 9:25 AM, Bruce Southey <bsouthey@gmail.com> wrote:

On 02/28/2011 09:02 AM, Benjamin Root wrote:
[snip]

>
>
> So, is there still no hope in addressing this old bug report of mine?
>
> http://projects.scipy.org/numpy/ticket/1562
>
> Ben Root
>

I think you need to add more details to this. So do you have an example
of the problem that includes code and expected output?

Perhaps genfromtxt is more appropriate than loadtxt for what
you want:

from StringIO import StringIO
import numpy as np
# The third input line has only one column, so it is invalid.
t = StringIO("1,1.3,abcde\n2,2.3,wxyz\n1\n3,3.3,mnop")
data = np.genfromtxt(t,
                     dtype=[('myint', 'i8'), ('myfloat', 'f8'), ('mystring', 'S5')],
                     names=['myint', 'myfloat', 'mystring'],
                     delimiter=",", invalid_raise=False)
print 'Bad data raise\n', data

This gives the output that skips the incomplete 3rd line:

/usr/lib64/python2.7/site-packages/numpy/lib/npyio.py:1507:
ConversionWarning: Some errors were detected !
Line #3 (got 1 columns instead of 3)
warnings.warn(errmsg, ConversionWarning)
Bad data raise
[(1, 1.3, 'abcde') (2, 2.3, 'wxyz') (3, 3.3, 'mnop')]

Bruce

Bruce,

I think you misunderstood the problem I was reporting.

Probably - which is why I asked for more details.

You can find the discussion thread here:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg26235.html

I have proposed that, at the very least, an example of this problem be added to the documentation of loadtxt so that users are aware of this possibility.

I did not connect the ticket to that email thread. Setting aside the structured array part of your email, I think the argument is essentially about what the output of the following should be:
np.loadtxt(StringIO("89.23"))
np.arange(5)[1]

These return a 0-d array, and there is a rather old argument about that (which may address the other part of the ticket). Really, I see this behavior as standard, so you could add an example to the documentation to reflect that.
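
For reference, the two calls can be inspected as follows. This is only a minimal sketch: it uses the Python 3 import `from io import StringIO` rather than the Python 2 `StringIO` module used earlier in the thread, and the variable names are illustrative.

```python
from io import StringIO
import numpy as np

# A file holding a single value is squeezed down to zero dimensions.
loaded = np.loadtxt(StringIO("89.23"))

# Integer indexing into a 1-d array yields a single element.
indexed = np.arange(5)[1]

# Both results report zero dimensions and an empty shape tuple.
print(loaded.ndim, loaded.shape)
print(indexed.ndim, indexed.shape)
```

Code that assumes a row or column axis (for example `loaded[0]`) will not work on either result.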

I agree that this behavior has become standard and is, by and large, desirable. It just comes with this sneaky pitfall when encountering single-line files. Therefore, I have a few suggestions, any of which I would find suitable for resolving this report. I will leave it up to the developers to decide which course to pursue.

1. Add a "mindims" parameter that would default to None (for current behavior). The caller can specify the minimum number of dimensions the resulting array should have, and loadtxt would then call some sort of function like np.atleast_nd() (I know it doesn't exist, but such a function might be useful). The documentation for this keyword param would allude to the rationale for its use.

2. Keep the current behavior, but possibly not for when a dtype is specified. Given that the squeeze() was meant for addressing the situation where the data structure is not known a priori, squeezing a known dtype seems to go against this rationale.

3. Keep the current behavior, but add some documentation for loadtxt() that illustrates the problem and shows the usage of a function like np.atleast_2d(). I would be willing to write up such an example.
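
A rough sketch of the pitfall and of the workarounds in suggestions 1 and 3, written with Python 3 imports. The `atleast_nd` helper here is hypothetical (NumPy does not provide it); it is just one guess at what such a function could do, namely prepend length-1 axes.

```python
from io import StringIO
import numpy as np

def atleast_nd(arr, ndim):
    """Hypothetical helper: prepend length-1 axes until arr has ndim dimensions."""
    arr = np.asanyarray(arr)
    missing = max(ndim - arr.ndim, 0)
    return arr.reshape((1,) * missing + arr.shape)

# Two rows of data come back as a 2-d array.
multi = np.loadtxt(StringIO("1 2 3\n4 5 6"))

# A single row is squeezed to 1-d, so row-wise code breaks.
single = np.loadtxt(StringIO("1 2 3"))

# Either call restores the row axis for the single-line case.
fixed_2d = np.atleast_2d(single)
fixed_nd = atleast_nd(single, 2)

print(multi.shape, single.shape, fixed_2d.shape, fixed_nd.shape)
```

Note that prepending axes treats a 1-d result as one row; a file with a single column and many rows would need the new axis appended instead, which is part of why the caller has to know the intended layout.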

In addition, loadtxt fails on empty files even when provided with a dtype. I believe genfromtxt fails in this case as well.
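
Until that is addressed, a caller could guard against empty input before calling loadtxt. The wrapper below is only a sketch under a few assumptions: `loadtxt_or_empty` is a hypothetical name, it takes the file contents as a string rather than a filename, and it only detects blank input (a file containing nothing but comment lines is not handled).

```python
from io import StringIO
import numpy as np

def loadtxt_or_empty(text, dtype=float):
    """Hypothetical wrapper: return an empty array of the given dtype for blank input."""
    if not text.strip():
        # Nothing to parse, so hand back a zero-length array with the requested dtype.
        return np.empty((0,), dtype=dtype)
    return np.loadtxt(StringIO(text), dtype=dtype)

record = np.dtype([("myint", "i8"), ("myfloat", "f8")])

empty = loadtxt_or_empty("", dtype=record)
filled = loadtxt_or_empty("1 1.3\n2 2.3", dtype=record)

print(empty.shape, empty.dtype)
print(filled["myint"], filled["myfloat"])
```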

Ben Root

Errors on empty files probably should be a new bug report as that was not in the ticket.

Done: http://projects.scipy.org/numpy/ticket/1752

Thanks,
Ben Root