[MATRIX-SIG] a faster array io module

Mike Miller miller5@uiuc.edu
22 Feb 1998 11:06:21 -0600


I decided that it was time to learn to extend Python so I could
take a stab at a high-speed (well, at least fairly quick) module
to read tables of ASCII data in a file into NumPy arrays.  So I
wrote arrayio [1].  

At the moment, it contains only one function, arrayio.readASCII.
An arrayio.writeASCII is forthcoming.

readASCII does the following: 

- fgets and sscanf's through the file and stores stuff that looks
  like columns of numbers in a C array.
- To allow for comments in the input file, it skips any line that
  contains a "#" or "!".
- makes a NumPy array from the data and returns it.
- the shape of the returned array is M x N where M is the number
  of rows in the file and N is the number of columns in the input
  line with the /smallest/ number of columns (that sscanf
  finds). 
- No attempt is made to deal with non-numeric data, other than the
  commenting scheme.
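
The steps above can be sketched in pure Python roughly as follows.
This is only an illustration of the behavior described, not the C
code in arrayio. The name read_ascii_rows is made up for the example,
and two details are assumptions: parsing stops at the first field on
a line that is not a number (approximating what sscanf would do), and
lines with no leading numbers are dropped.

```python
def read_ascii_rows(lines, comment_chars="#!"):
    """Collect rows of numbers from an iterable of text lines.

    Hypothetical helper that mimics the behavior described for
    arrayio.readASCII; it is not the module's actual code.
    """
    rows = []
    for line in lines:
        # Skip any line that contains a comment character anywhere.
        if any(ch in line for ch in comment_chars):
            continue
        values = []
        for token in line.split():
            try:
                values.append(float(token))
            except ValueError:
                # Assumption: stop at the first non-numeric field.
                break
        # Assumption: lines with no leading numbers are ignored.
        if values:
            rows.append(values)
    if not rows:
        return []
    # N is the column count of the line with the fewest columns.
    ncols = min(len(row) for row in rows)
    return [row[:ncols] for row in rows]


if __name__ == "__main__":
    sample = [
        "# x y z\n",
        "1 2 3\n",
        "4 5 6 7\n",
        "8 9 10 ! bad point\n",
        "11 12 abc\n",
    ]
    for row in read_ascii_rows(sample):
        print(row)
```

The result is a list of M rows with N floats each, which could be
handed to NumPy's array constructor to get the M x N array.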

I've tested this with a 6 x 9818 array that I had lying around
and found that it's fairly fast - about 18 times quicker than
Konrad Hinsen's all-python ArrayIO, which was the inspiration.
But I have some questions that maybe folks here can answer.

1) The module uses a static C array that is 40 x 10000.  Can
   anyone teach me to dynamically extend this as needed?

2) Is it possible to use .append somehow on Python arrays in the
   C module?  If I could do that, I'd just append each line as I
   read it from the file.

3) Any suggestions on fancier parsing schemes?

Any other comments or suggestions are welcome.

Regards, Mike

[1] <URL:http://www.npl.uiuc.edu/~miller/python>


-- 
Michael A. Miller                                miller5@uiuc.edu
  Department of Physics, University of Illinois, Urbana-Champaign
  PGP public key available on request

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________