need fast parser for comma/space delimited numbers
Michael A. Miller
mmiller3 at iupui.edu
Sun Mar 19 17:59:41 CET 2000
>>>>> "Les" == Les Schaffer <godzilla at netmeg.net> writes:
> I have written an application for reading in large amounts
> of space/comma delimited numbers from ASCII text files for
> statistical processing.
> I originally used re expressions for splitting, but I was
> able to cut the time required for data file parsing down to
> a third by using string.split on the comma or space.
> Still, the app takes about 5 minutes to parse a typical set
> of data files. I'd like to drop that down to about a minute,
> which means I probably need to wrap this in a C module with
> something like an sscanf. Or maybe just a function which
> finds the delimiters and delivers the number parts of the
> string (defined by delimiters) to atoi and atof functions.
> But before I get started, I imagine someone else has
> already done this.
> Anyone have pointers to said code or suggestions? I'll
> happily post my code if there is none out there already.
TableIO does exactly what you're looking for and is
reasonably fast, I think. It is a C extension for reading data
from ASCII files. Rather than using scanf, it uses fgets and
strtok to parse lines. It also allows you to flag certain lines
as "comment" lines by skipping any lines containing a specified
character.
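
For anyone who wants to see the idea before installing anything, here
is a rough pure-Python sketch of the same approach: read one line at a
time, skip comment lines, and tokenize on commas and whitespace. This
is not TableIO's code or its API. The names read_table, comment_chars
and delimiters are made up for this example, and it is written for a
current Python rather than the string module.

```python
import io

def read_table(stream, comment_chars="#", delimiters=", \t"):
    """Parse delimited numbers line by line, skipping comment lines.

    Hypothetical helper for illustration only, not part of TableIO.
    """
    rows = []
    for line in stream:  # one line at a time, much like fgets
        # Skip any line containing one of the comment characters.
        if any(c in line for c in comment_chars):
            continue
        # Turn every delimiter into a space so a bare split() tokenizes
        # the line and collapses runs of delimiters, much as strtok does.
        for d in delimiters:
            line = line.replace(d, " ")
        fields = line.split()
        if fields:  # ignore blank lines
            rows.append([float(f) for f in fields])
    return rows

sample = io.StringIO("# x, y, z\n1.0, 2.5 3\n\n4,5,6e-1\n")
print(read_table(sample))  # [[1.0, 2.5, 3.0], [4.0, 5.0, 0.6]]
```

A C extension doing the equivalent work should be a good deal faster
on large files, but a sketch like this is handy for checking that the
delimiter and comment handling match your data.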
Michael A. Miller mmiller3 at iupui.edu
Krannert Institute of Cardiology, IU School of Medicine