Re: [Numpy-discussion] loadtxt ndmin option

5 May 2011

On 4 May 2011, at 17:34, Derek Homeier wrote:
...
Hi Paul,
I've got back to your suggestion re. the ndmin flag for loadtxt from a few weeks ago...
On 27.03.2011, at 12:09PM, Paul Anton Letnes wrote:
...
...
...
1562:
I attach a possible patch. This could also be the default  
behavior to my mind, since the function caller can simply call  
numpy.squeeze if needed. Changing default behavior would probably  
break old code, however.
See comments on Trac as well.
Your patch is better, but there is one thing I disagree with.
808    if X.ndim < ndmin:
809        if ndmin == 1:
810            X.shape = (X.size, )
811        elif ndmin == 2:
812            X.shape = (X.size, 1) 
The last line should be:
812            X.shape = (1, X.size) 
If someone wants a 2D array out, they would most likely expect a one-row file to come out as a one-row array, not the other way around. IMHO.
I think you are completely right for the test case with one row. More generally though, 
since a file of N rows and M columns is read into an array of shape (N, M), ndmin=2 
should enforce X.shape = (1, X.size) for single-row input, and X.shape = (X.size, 1) 
for single-column input.
I thought this would be handled automatically by preserving the original 2 dimensions, 
but apparently with single-row/multi-column input an extra dimension of length 1 is 
prepended when the array is returned from the parser. I've put up a fix for this at
https://github.com/dhomeier/numpy/compare/master...ndmin-cols
and also tested the patch against 1.6.0.rc2.
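To make the intended shapes concrete, here is a minimal sketch of the "preserve the original 2 dimensions" idea. It is not the code from the patch or from loadtxt itself; the helper name apply_ndmin and its rows argument are hypothetical, and it assumes the parsed data is already available as a rectangular list of rows.

```python
import numpy as np

def apply_ndmin(rows, ndmin=0):
    # Hypothetical helper for illustration; this is not the loadtxt source.
    if ndmin not in (0, 1, 2):
        raise ValueError("ndmin must be 0, 1 or 2")
    # Parsed text always starts out as N rows by M columns.
    X = np.array(rows, dtype=float)
    if X.ndim != 2:
        raise ValueError("expected a rectangular list of rows")
    if ndmin == 2:
        # Keep (N, M) untouched, so one row stays (1, M) and one column (N, 1).
        return X
    # For ndmin 0 or 1, drop all length-1 axes first.
    X = np.squeeze(X)
    if ndmin == 1:
        # A single value squeezes to 0-d; bring it back to shape (1,).
        X = np.atleast_1d(X)
    return X

one_row = apply_ndmin([[1, 2, 3]], ndmin=2)
one_col = apply_ndmin([[1], [2], [3]], ndmin=2)
print(one_row.shape, one_col.shape)  # (1, 3) (3, 1)
```

Because the 2-D shape is never thrown away when ndmin=2, no guess about row versus column orientation is needed afterwards.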
Cheers,
Derek
Looks sensible to me at least!

But: Aren't the numpy.atleast_2d and numpy.atleast_1d functions written for this? Shouldn't we reuse them? Perhaps it's overkill, and perhaps it will reintroduce the 'transposed' problem?

Paul
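
The question about numpy.atleast_2d can be checked directly. The sketch below assumes the data has already been squeezed to 1-D, as it would be for a single-row or single-column file; the variable names are illustrative only. numpy.atleast_2d always prepends the new axis, so it gives the expected result for a single row but turns a single column into a row unless the result is transposed.

```python
import numpy as np

# After squeezing, a 1x3 file and a 3x1 file look identical: shape (3,).
squeezed_row = np.squeeze(np.array([[1.0, 2.0, 3.0]]))
squeezed_col = np.squeeze(np.array([[1.0], [2.0], [3.0]]))

# atleast_2d prepends the new axis in both cases.
row_2d = np.atleast_2d(squeezed_row)       # (1, 3): correct for one row
col_2d = np.atleast_2d(squeezed_col)       # (1, 3): transposed for one column

# Transposing restores the column orientation, if the caller knows it was a column.
col_fixed = np.atleast_2d(squeezed_col).T  # (3, 1)

# atleast_1d covers the ndmin=1 case for a single value.
scalar_1d = np.atleast_1d(np.float64(4.0))  # (1,)

print(row_2d.shape, col_2d.shape, col_fixed.shape, scalar_1d.shape)
```

So the functions can be reused, but only together with information about whether the squeezed data came from a row or a column.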