[Numpy-discussion] Add `nrows` to `genfromtxt`
Alexander Belopolsky
ndarray at mac.com
Sat Nov 1 16:41:21 EDT 2014
On Sat, Nov 1, 2014 at 3:15 PM, Warren Weckesser
<warren.weckesser at gmail.com> wrote:
> Is there wider interest in such an argument to `genfromtxt`? For my
> use-cases, `max_rows` is sufficient. I can't recall ever needing the full
> generality of a slice for pulling apart a text file. Does anyone have
> compelling use-cases that are not handled by `max_rows`?
>
It is occasionally useful to be able to skip rows after the header. Maybe
we should de-deprecate skip_rows and give it a meaning different from
skip_header in the case of names = None? For example,
    genfromtxt(fname, skip_header=3, skip_rows=1, max_rows=100)
would mean: skip 3 lines, read the column names from the 4th, skip the 5th,
and process up to 100 more lines. This may be useful if the file contains
some metadata about the columns below the header line. For example, it is
common to put units of measurement below the column names.
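
To make the proposed semantics concrete, here is a rough sketch that
emulates them with features genfromtxt already has (names=True and a list
of lines as input). The function name read_with_units_row and its
signature are hypothetical, made up for illustration; skip_rows and
max_rows are implemented here with itertools.islice rather than by
genfromtxt itself.

```python
import io
from itertools import islice

import numpy as np


def read_with_units_row(f, skip_header=0, skip_rows=0, max_rows=None):
    """Hypothetical helper emulating the skip_header/skip_rows/max_rows idea.

    f is any iterable of text lines, e.g. an open file.
    """
    lines = iter(f)
    # Discard the preamble that precedes the column-name line.
    for _ in islice(lines, skip_header):
        pass
    # The next line holds the column names.
    names_line = next(lines)
    # Discard the rows between the names and the data (e.g. a units row).
    for _ in islice(lines, skip_rows):
        pass
    # Keep at most max_rows data lines (all remaining lines if None).
    data_lines = list(islice(lines, max_rows))
    # genfromtxt accepts a list of strings; names=True takes the field
    # names from the first line it is given.
    return np.genfromtxt([names_line] + data_lines, names=True)


text = (
    "file: demo\n"
    "site: A\n"
    "date: 2014\n"
    "time temp\n"
    "s degC\n"
    "0.0 20.5\n"
    "1.0 21.0\n"
    "2.0 21.5\n"
)
arr = read_with_units_row(io.StringIO(text), skip_header=3, skip_rows=1,
                          max_rows=2)
```

In this sketch the units row ("s degC") is simply thrown away; a real
implementation might want to keep it somewhere.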
Another application could be processing a large text file in chunks, which
again can be covered nicely by skip_rows/max_rows.
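
For comparison, chunked reading can already be approximated by slicing
the line iterator and handing each slice to genfromtxt. This is only a
sketch: iter_chunks is a hypothetical name, and it assumes purely numeric
data with at least two columns and no blank or comment-only chunks.

```python
import io
from itertools import islice

import numpy as np


def iter_chunks(f, chunk_rows, **kwargs):
    """Yield 2-D arrays of at most chunk_rows rows from an iterable of lines."""
    lines = iter(f)
    while True:
        chunk = list(islice(lines, chunk_rows))
        if not chunk:
            break
        # A one-line chunk comes back 1-D, so promote it to a single row.
        # (This assumes the data has at least two columns.)
        yield np.atleast_2d(np.genfromtxt(chunk, **kwargs))


text = "1 2\n3 4\n5 6\n7 8\n9 10\n"
chunks = list(iter_chunks(io.StringIO(text), chunk_rows=2))
shapes = [c.shape for c in chunks]
total = sum(c.sum() for c in chunks)
```

With skip_rows/max_rows in genfromtxt itself, the same loop could pass a
file name and an offset instead of managing the line iterator by hand.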
I cannot think of a situation where I would need more generality, such as
reading every 3rd row or only rows with given line numbers. Such processing
is normally done after the text data is loaded into an array.