[Numpy-discussion] Add `nrows` to `genfromtxt`

Alexander Belopolsky ndarray at mac.com
Sat Nov 1 16:41:21 EDT 2014


On Sat, Nov 1, 2014 at 3:15 PM, Warren Weckesser <warren.weckesser at gmail.com> wrote:

> Is there wider interest in such an argument to `genfromtxt`?  For my
> use-cases, `max_rows` is sufficient.  I can't recall ever needing the full
> generality of a slice for pulling apart a text file.  Does anyone have
> compelling use-cases that are not handled by `max_rows`?
>

It is occasionally useful to be able to skip rows after the header.  Maybe
we should de-deprecate skip_rows and give it a meaning different from
skip_header in the case of names = None?  For example,

genfromtxt(fname, skip_header=3, skip_rows=1, max_rows=100)

would mean: skip 3 lines, read column names from the 4th, skip the 5th, and
process up to 100 more lines.  This may be useful if the file contains some
metadata about the columns below the header line.  For example, it is
common to put units of measurement below the column names.
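
To make the intended semantics concrete, here is a rough sketch in plain
Python.  The helper name select_lines and the sample data are made up for
illustration; this is not existing genfromtxt behaviour, only my reading of
the proposal above.

```python
from itertools import islice


def select_lines(lines, skip_header=0, skip_rows=0, max_rows=None):
    """Yield the column-name line, then up to max_rows data lines.

    Hypothetical helper (not part of NumPy) emulating the proposed
    meaning: drop skip_header lines, keep the next line as the names,
    drop skip_rows lines, then keep at most max_rows lines.
    """
    it = iter(lines)
    # Discard the leading lines before the column-name line.
    for _ in islice(it, skip_header):
        pass
    # The next line carries the column names.
    header = next(it, None)
    if header is None:
        return
    yield header
    # Discard metadata lines (e.g. units) that follow the names.
    for _ in islice(it, skip_rows):
        pass
    # Pass through at most max_rows data lines (all of them if None).
    yield from islice(it, max_rows)


# Made-up sample: 3 comment lines, names, units, then data.
sample = [
    "# comment 1",
    "# comment 2",
    "# comment 3",
    "time,temp",
    "s,K",
    "0.0,273.1",
    "1.0,274.2",
    "2.0,275.3",
]
selected = list(select_lines(sample, skip_header=3, skip_rows=1, max_rows=2))
print(selected)
```

Since genfromtxt accepts a list of strings or a generator in place of a file
name, the selected lines could then be passed on with something like
genfromtxt(selected, delimiter=",", names=True).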

Another application could be processing a large text file in chunks, which
again can be covered nicely by skip_rows/max_rows.
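
The chunking idea can be sketched without any new arguments at all.  The
generator below is an illustrative assumption (iter_chunks is a made-up
name, not a NumPy function); each chunk it yields is a list of lines that
could be handed to genfromtxt separately.

```python
from itertools import islice


def iter_chunks(lines, chunk_size):
    """Yield successive lists of at most chunk_size lines.

    Hypothetical helper for illustration; it consumes the input
    lazily, so a large file is never held in memory all at once.
    """
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Made-up data: 7 rows split into chunks of 3.
rows = ["%d,%d" % (i, i * i) for i in range(7)]
chunks = list(iter_chunks(rows, 3))
print([len(c) for c in chunks])
```

With an open file object in place of rows, each chunk would hold the next
chunk_size lines of the file.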

I cannot think of a situation where I would need more generality, such as
reading every 3rd row or rows with given line numbers.  Such processing is
normally done after the text data is loaded into an array.