On 31 May 2011, at 21:28, Pierre GM wrote:
On May 31, 2011, at 6:37 PM, Derek Homeier wrote:
On 31 May 2011, at 18:25, Pierre GM wrote:
On May 31, 2011, at 5:52 PM, Derek Homeier wrote:
I think stuff like multiple delimiters should have been dealt with before, as the right place to insert the ndmin code (which includes the decision to squeeze or not to squeeze as well as to add additional dimensions, if required) would be right at the end before the 'unpack' switch, or rather replacing the bit:
if usemask: output = output.view(MaskedArray) output._mask = outputmask if unpack: return output.squeeze().T return output.squeeze()
But there it's already not clear to me how to deal with the MaskedArray case...
Oh, easy. You need to replace only the last three lines of genfromtxt with the ones from loadtxt (808-833). Then, if usemask is True, you need to use ma.atleast_Xd instead of np.atleast_Xd. Et voilĂ . Comments: * I would raise an exception if ndmin isn't correct *before* trying to read the file... * You could define a `collapse_function` that would be `np.atleast_1d`, `np.atleast_2d`, `ma.atleast_1d`... depending on the values of `usemask` and `ndmin`...
Thanks, that helped to clean up a little bit.
If you have any question about numpy.ma, don't hesitate to contact me directly.
Thanks for the directions! I was not sure about the usemask case because it presently does not invoke .squeeze() either...
The idea is that if `usemask` is True, you build a second array (the mask), that you attach to your main array at the very end (in the `output=output.view(MaskedArray), output._mask = mask` combo...). Afterwards, it's a regular MaskedArray that supports the .squeeze() method...
OK, in both cases output.squeeze() is now used if ndim>ndmin and usemask is False - at least it does not break any tests, so it seems to work with MaskedArrays as well.
On a possibly related note, genfromtxt also treats the 'unpack'ing of structured arrays differently from loadtxt (which returns a list of arrays in that case) - do you know if this is on purpose, or also rather missing functionality (I guess it might break recfromtxt()...)?
Keep in mind that I haven't touched genfromtxt since 8-10 months or so. I wouldn't be surprised that it were lagging a bit behind loadtxt in terms of development. Yes, there'll be some tweaking to do for recfromtxt (it's OK for now if `ndmin` and `unpack` are the defaults) and others, but nothing major.
Well, at long last I got to implement the above and added the corresponding tests for genfromtxt - with the exception of the dimension-0 cases, since genfromtxt raises an error on empty files. There already is a comment it should perhaps better return an empty array, so I am putting that idea up for discussion here again. I tried to devise a very basic test with masked arrays, just added to test_withmissing now. I also implemented the same unpacking behaviour for structured arrays and just made recfromtxt set unpack=False to work (or should it issue a warning?). The patches are up for review as commit 8ac01636 in my iocnv-wildcard branch: https://github.com/dhomeier/numpy/compare/master...iocnv-wildcard Cheers, Derek