[Numpy-discussion] using loadtxt() for given number of rows?

Tue Feb 1 03:42:14 EST 2011

On Mon, 31 Jan 2011, Christopher Barker wrote:

> On 1/31/11 4:39 AM, Robert Cimrman wrote:
>> I work with text files which contain several arrays separated by a few
>> lines of other information, for example:
>>
>> POINTS 4 float
>> -5.000000e-01 -5.000000e-01 0.000000e+00
>> 5.000000e-01 -5.000000e-01 0.000000e+00
>> 5.000000e-01 5.000000e-01 0.000000e+00
>> -5.000000e-01 5.000000e-01 0.000000e+00
>>
>> CELLS 2 8
>> 3 0 1 2
>> 3 2 3 0
>
>> I have used custom Python code with loops to read similar files, so the
>> speed was not too good. Now I wonder if it would be possible to use the
>> numpy.loadtxt() function for the "array-like" parts. It supports passing
>> an open file object in, which is good, but it wants to read the entire
>> file, which does not work in this case.
>>
>> It seems to me that an additional parameter to loadtxt(), say "nrows" or
>> "numrows", would do the job,
>
> I agree that that would be a useful feature. However, I'm not sure it
> would help performance much -- I think loadtxt is written in Python as well.

I see. Still, it would let me reduce my code size, which is a good
thing in itself. So there is now a new enhancement ticket [1].
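
Until loadtxt() has such a parameter, the effect can be approximated by
handing it only the lines it should see. This is a minimal sketch, not a
tested solution: read_rows() is a made-up helper name, and it assumes the
block has no blank or comment lines inside it.

```python
from itertools import islice

import numpy as np


def read_rows(f, n_rows, dtype=float):
    """Read the next n_rows lines of an open text file into a 2-D array.

    Hypothetical helper: assumes the block contains no blank or comment
    lines, since loadtxt() would silently skip those.
    """
    # islice() hands loadtxt() only n_rows lines, so the file position
    # ends up right after the block and the rest can be read as usual.
    rows = np.loadtxt(islice(f, n_rows), dtype=dtype, ndmin=2)
    if rows.shape[0] != n_rows:
        raise ValueError("expected %d rows, got %d" % (n_rows, rows.shape[0]))
    return rows
```

With f positioned just after the "POINTS 4 float" line of the example,
read_rows(f, 4) should give the 4x3 coordinate array, and f.readline()
then continues with whatever follows the block.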

> One option in the meantime. If you know how many rows, you presumably
> know how many items on each row. In that case, you can use:
>
> np.fromfile(open_file, sep=' ', count=num_items_to_read)
>
> It'll only work for multi-line text if the separator is whitespace,
> which it was in your example. But if it does, it should be pretty fast.

Good idea. The prerequisites are not always met, but often enough.
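
For the example file above, the np.fromfile() route could look roughly
like the sketch below. The helper names (skip_to_header, read_block,
read_mesh) are made up for illustration. It assumes whitespace-separated
numbers, three coordinates per point, and that every cell has the same
number of entries, which is not true for mixed-cell meshes.

```python
import numpy as np


def skip_to_header(f, keyword):
    """Advance f past the header line starting with keyword; return its fields."""
    for line in iter(f.readline, b''):
        fields = line.split()
        if fields and fields[0] == keyword:
            return fields
    raise ValueError("header %r not found" % keyword)


def read_block(f, n_rows, n_cols, dtype=float):
    """Read n_rows * n_cols whitespace-separated numbers from an open file."""
    count = n_rows * n_cols
    # sep=' ' matches any whitespace, so line breaks inside the block are fine.
    data = np.fromfile(f, dtype=dtype, sep=' ', count=count)
    if data.size != count:
        raise ValueError("file ended before the block was complete")
    return data.reshape(n_rows, n_cols)


def read_mesh(path):
    """Read the POINTS and CELLS blocks of a file laid out like the example."""
    # Binary mode keeps the Python file position in step with np.fromfile().
    with open(path, 'rb') as f:
        n_points = int(skip_to_header(f, b'POINTS')[1])
        points = read_block(f, n_points, 3)  # assumption: 3 coordinates per point
        header = skip_to_header(f, b'CELLS')
        n_cells, n_ints = int(header[1]), int(header[2])
        # Assumption: all cells have the same size, so rows are n_ints / n_cells long.
        cells = read_block(f, n_cells, n_ints // n_cells, dtype=int)
    return points, cells
```

The header lines still have to be parsed by hand, but the numeric parts
no longer go through a Python loop.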

Thanks!

r.

[1] http://projects.scipy.org/numpy/ticket/1731