[MATRIX-SIG] a faster array io module

Konrad Hinsen hinsen@ibs.ibs.fr
Tue, 24 Feb 1998 11:12:46 +0100


> I decided that it was time to learn to extend python so I could
> take a stab at a high speed (well at least fairly quick) module
> to read tables of ASCII data in a file into NumPy arrays.  So I
> wrote arrayio [1].  

Sounds good...

> But I have some questions that maybe folks here can answer.
> 
> 1) The module uses a static C array that is 40 x 10000.  Can
>    anyone teach me to dynamically extend this as needed?

There are several solutions, as always, but here is one suggestion:

1) Scan the first line to find the number of columns, then
   rewind the file for real processing.

2) Create an empty Python list (PyList_New(0)).

3) For each line, allocate a 1D array (yes, a NumPy array, using
   PyArray_FromDims()), read the data, and append the array to the list.

4) Convert the list of arrays to a 2D array using
   PyArray_ContiguousFromObject().

That is more or less a C translation of my Python code, and it saves
you all the trouble of dynamic memory allocation in C. It should still
be a lot faster than the pure Python code because the I/O is handled
in C. In fact, I would be surprised if a "pure" C version were faster.
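
As a rough illustration of the four steps, here is a minimal Python
sketch of the same scheme. It uses the Python-level API of current NumPy
rather than the C calls named above, and the function name read_table is
hypothetical (it is not part of arrayio or NumPy). It assumes
whitespace-separated real numbers and a seekable input stream.

```python
import io
import numpy as np

def read_table(stream):
    """Read whitespace-separated numbers from a seekable text stream.

    Hypothetical helper, not part of arrayio or NumPy.
    """
    # Step 1: scan the first line for the column count, then rewind.
    ncols = len(stream.readline().split())
    stream.seek(0)

    # Step 2: an empty list that grows as lines are read.
    rows = []

    # Step 3: one 1D array per line, appended to the list.
    for lineno, line in enumerate(stream, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != ncols:
            raise ValueError("line %d has %d values, expected %d"
                             % (lineno, len(fields), ncols))
        rows.append(np.array([float(f) for f in fields]))

    # Step 4: convert the list of 1D arrays to one 2D array.
    return np.array(rows, dtype=float).reshape(len(rows), ncols)


table = read_table(io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n"))
print(table.shape)  # (2, 3)
```

The list takes care of all the growing, so no explicit reallocation is
needed; the only large allocation happens once, in the final conversion.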

> 2) Is it possible to use .append somehow on Python arrays in the
>    C module?  If I could do that, I'd just append each line as I
>    read it from the file.

Not on arrays, but on lists - that's essentially what my scheme does.

> 3) Any suggestions on fancier parsing schemes?

I don't know what you do about data types, but here's a wish list:

- Support for integer, real, and complex arrays.
- Support for Fortran-style double precision input (i.e. using a D
  as the exponent marker instead of an E).
- A verification that all lines have the same number of values.
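
To make the first two wishes concrete, here is a small Python sketch of
how a single field could be converted. The names parse_number and
common_type are hypothetical, and the rule "try integer, then real, then
complex" is only one possible convention, not something arrayio is known
to do.

```python
import re

# A Fortran exponent marker: a D (or d) that follows a digit or a dot
# and is followed by an optionally signed digit, as in 1.5D+03.
_FORTRAN_EXP = re.compile(r"(?<=[0-9.])[dD](?=[+-]?[0-9])")

def parse_number(token):
    """Convert one text field to int, float, or complex (hypothetical helper)."""
    token = _FORTRAN_EXP.sub("e", token)
    # Try the narrowest type first and widen only when conversion fails.
    for convert in (int, float, complex):
        try:
            return convert(token)
        except ValueError:
            pass
    raise ValueError("not a number: %r" % token)

def common_type(values):
    """Pick the narrowest Python type that can hold every parsed value."""
    if any(isinstance(v, complex) for v in values):
        return complex
    if any(isinstance(v, float) for v in values):
        return float
    return int


values = [parse_number(t) for t in "3 1.5D+03 2.5e-1".split()]
print(values)               # [3, 1500.0, 0.25]
print(common_type(values))  # <class 'float'>
```

The type chosen by common_type could then decide whether the final array
is integer, real, or complex.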

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________