need fast parser for comma/space delimited numbers

Alex alex at somewhere.round.here
Sat Mar 18 11:59:31 EST 2000


> I have written an application for reading in large amounts of
> space/comma delimited numbers from ASCII text files for statistical
> processing.
> 
> I originally used re expressions for splitting, but I was able to cut
> the time required for data file parsing down to a third by using
> string.split on the comma or space.
> 
> Still, the app takes about 5 minutes to parse a typical set of data
> files. I'd like to drop that down to a minute if possible.

Hmm, I must be missing something.  The following code takes about 8s on
a Sparc 10 and about 5s on a Linux box with a 400MHz Pentium II.

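import string, time

# one sample record: an int and two floats, tab-separated
line = '356\t0.23514\t0.1784'

start_time = time.time()
# repeat the parse 90000 times to simulate a 90000-line file
for i in 90000 * [None]:
    # split on whitespace, then drop any empty fields
    split_line = filter(None, string.split(line))
    n = int(split_line[0])
    x = float(split_line[1])
    y = float(split_line[2])

# elapsed wall-clock time in seconds
print time.time() - start_time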

...yet 90000 is about the number of lines you are parsing, right?
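
For the comma-or-space case in the original question, one possible approach is to turn commas into spaces first and then split on whitespace. What follows is only a sketch in current Python 3 syntax, not something timed on the machines mentioned above. The name parse_line and the mixed-delimiter sample line are made up for illustration, and the int/float/float layout is assumed from the test line above.

```python
import timeit


def parse_line(line):
    """Parse one 'int float float' record delimited by commas and/or whitespace."""
    # Replace commas with spaces so one whitespace split handles both delimiters.
    fields = line.replace(',', ' ').split()
    return int(fields[0]), float(fields[1]), float(fields[2])


if __name__ == '__main__':
    sample = '356, 0.23514\t0.1784'  # hypothetical mixed-delimiter record
    print(parse_line(sample))
    # Time 90000 calls, the line count assumed in the message above.
    elapsed = timeit.timeit(lambda: parse_line(sample), number=90000)
    print('%.3f s for 90000 lines' % elapsed)
```

Calling split() with no argument already collapses runs of whitespace, so no separate filtering of empty fields is needed in this version.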

Alex.


