On 4 May 2011, at 17:34, Derek Homeier wrote:
Hi Paul,
I've got back to your suggestion re. the ndmin flag for loadtxt from a few weeks ago...
On 27.03.2011, at 12:09PM, Paul Anton Letnes wrote:
1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code, however.
See comments on Trac as well.
Your patch is better, but there is one thing I disagree with.

    808    if X.ndim < ndmin:
    809        if ndmin == 1:
    810            X.shape = (X.size, )
    811        elif ndmin == 2:
    812            X.shape = (X.size, 1)

The last line should be:

    812            X.shape = (1, X.size)

If someone wants a 2D array out, they would most likely expect a one-row file to come out as a one-row array, not the other way around. IMHO.
I think you are completely right for the test case with one row. More generally though, since a file of N rows and M columns is read into an array of shape (N, M), ndmin=2 should enforce X.shape = (1, X.size) for single-row input, and X.shape = (X.size, 1) for single-column input. I thought this would be handled automatically by preserving the original 2 dimensions, but apparently with single-row/multi-column input an extra dimension of length 1 is prepended when the array is returned from the parser. I've put up a fix for this at
https://github.com/dhomeier/numpy/compare/master...ndmin-cols
and also tested the patch against 1.6.0.rc2.
Cheers, Derek
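To make the shape rule discussed above concrete, here is a minimal sketch. It is not the code from the patch or from loadtxt itself: `apply_ndmin` is a hypothetical helper, and it assumes the parser hands back the data as a list of N rows with M values each.

```python
import numpy as np


def apply_ndmin(rows, ndmin=0):
    """Hypothetical helper illustrating the proposed ndmin semantics.

    `rows` is assumed to be a list of N rows, each a list of M values,
    which is an assumption about the parser output, not loadtxt's real code.
    """
    if ndmin not in (0, 1, 2):
        raise ValueError("ndmin must be 0, 1 or 2")

    # A list of N rows with M values each becomes an (N, M) array, so the
    # row/column orientation of the file is still known at this point.
    X = np.array(rows, dtype=float)

    if ndmin == 2:
        # Keep both dimensions: one row -> (1, M), one column -> (N, 1).
        return X

    # Default behaviour: drop all length-1 dimensions.
    X = np.squeeze(X)

    if X.ndim < ndmin:
        # Only reachable for ndmin == 1 with a single value (0-d array).
        X = X.reshape(X.size)
    return X


one_row = [[1.0, 2.0, 3.0]]          # file with 1 row, 3 columns
one_col = [[1.0], [2.0], [3.0]]      # file with 3 rows, 1 column

print(apply_ndmin(one_row, ndmin=2).shape)
print(apply_ndmin(one_col, ndmin=2).shape)
print(apply_ndmin([[5.0]], ndmin=1).shape)
```

The point of the sketch is that the orientation has to be decided before any squeezing happens; once the array is 1-D, a single row and a single column are indistinguishable.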
Looks sensible to me at least! But: Aren't the numpy.atleast_2d and numpy.atleast_1d functions written for this? Shouldn't we reuse them? Perhaps it's overkill, and perhaps it will reintroduce the 'transposed' problem? Paul
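A quick check of how numpy.atleast_2d and numpy.atleast_1d treat already-squeezed data may help with the question above. This is only an illustration with made-up values, not part of either patch: atleast_2d always prepends the new axis, so a squeezed single-column file would come back as one row unless it is transposed afterwards.

```python
import numpy as np

# Squeezed 1-D data; whether it came from one row or one column is lost.
squeezed = np.array([1.0, 2.0, 3.0])

# atleast_2d prepends the new axis, giving a single row.
as_row = np.atleast_2d(squeezed)

# Transposing afterwards gives the single-column layout instead.
as_col = np.atleast_2d(squeezed).T

# atleast_1d promotes a 0-d array (a single value) to one element.
single = np.atleast_1d(np.array(5.0))

print(as_row.shape, as_col.shape, single.shape)
```

So the functions can be reused, but the caller still has to know whether the input was a single row or a single column in order to pick the right orientation.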