> On 31 Mar 2011, at 17:03, Bruce Southey wrote:
>
>> This is an invalid ticket because the docstring clearly states, in
>> three different yet critical places, that missing values are not
>> handled here:
>>
>> "Each row in the text file must have the same number of values."
>> "genfromtxt : Load data with missing values handled as specified."
>> "This function aims to be a fast reader for simply formatted files.
>> The `genfromtxt` function provides more sophisticated handling of,
>> e.g., lines with missing values."
>>
>> Really I am trying to separate the usage of loadtxt and genfromtxt to
>> avoid unnecessary duplication and confusion. Part of this is
>> historical because loadtxt was added in 2007 and genfromtxt was added
>> in 2009. So really certain features of loadtxt have been 'kept' for
>> backwards compatibility purposes yet these features can be 'abused' to
>> handle missing data. But I really consider that any missing values
>> should cause loadtxt to fail.
>>
> OK, I was not aware of the design issues of loadtxt vs. genfromtxt.
> You could probably say that is also for historical reasons, since I
> have not used genfromtxt much so far.
> Anyway, the docstring statement "Converters can also be used to
> provide a default value for missing data:" then appears quite
> misleading, or an invitation to abuse, if you will.
> It would be better to remove it from the documentation, or to
> explicitly discourage users from using converters instead of
> genfromtxt (I don't see how you could completely prevent using
> converters in this way).
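
To make the two approaches concrete, here is a minimal sketch of a converter supplying a default next to the genfromtxt equivalent. The sample data, the helper name to_float and the fill value -1.0 are made up for illustration; the bytes check is there because older NumPy releases pass bytes to converters while newer ones pass str.

```python
import io
import numpy as np

# Hypothetical sample data: the second row has an empty middle field.
text = "1,2,3\n4,,6\n"

def to_float(field):
    # Older NumPy versions hand converters bytes, newer ones hand them str.
    if isinstance(field, bytes):
        field = field.decode()
    field = field.strip()
    # Use -1.0 as an (arbitrary) default when the field is empty.
    return float(field) if field else -1.0

# loadtxt: the default has to be wired in per column through a converter.
a = np.loadtxt(io.StringIO(text), delimiter=",", converters={1: to_float})

# genfromtxt: missing fields are recognised and filled directly.
b = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=-1.0)

print(a)
print(b)
```

With loadtxt the caller has to know in advance which columns may be empty, whereas genfromtxt applies the fill value wherever a field is missing.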
>
>> The patch is incorrect because it should not include a space in the
>> split() as indicated in the comment by the original reporter. Of
> The split('\r\n') alone caused test_dtype_with_object(self) to fail,
> probably because it relies on stripping the blanks. But maybe the
> test is ill-formed?
>
>> course a corrected patch alone still is not sufficient to address the
>> problem without the user providing the correct converter. Also you
>> start to run into problems with multiple delimiters (such as one space
>> versus two spaces) so you start down the path to add all the features
>> that duplicate genfromtxt.
> Given that genfromtxt provides that functionality more conveniently,
> I agree again that users should be encouraged to use it instead of
> converters.
> But the actual tab problem in fact causes an issue not related to
> missing values at all (well, depending on what you call a missing
> value).
> I am describing an example on the ticket.
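
The ticket example is not reproduced in this message. As a rough illustration of the kind of tab problem meant, the plain-Python sketch below assumes a reader that strips each line and then splits it on the delimiter; it is not loadtxt's actual code, and the sample line is made up.

```python
# Hypothetical line with four tab-separated fields; the 2nd and 4th are empty.
line = "1\t\t3\t\n"

# A bare strip() removes all surrounding whitespace, tabs included,
# so the trailing empty field is gone before the split happens.
fields_strip_all = line.strip().split("\t")

# Stripping only the line-ending characters leaves every delimiter in place.
fields_strip_eol = line.strip("\r\n").split("\t")

print(fields_strip_all)  # ['1', '', '3']
print(fields_strip_eol)  # ['1', '', '3', '']
```

Under that assumption the first variant yields three fields and the second yields four, so rows of a tab-delimited file can end up with differing column counts even though every row has the same number of delimiters.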
>
> Cheers,
> Derek
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion