ASCII delimited files
Mike Fletcher
mcfletch at vrtelecom.com
Fri Nov 12 00:23:00 EST 1999
I never use CSV or TSV files myself (well, not more than once or twice per
year); I was just approaching the problem as one of optimisation (I'm an
optimisation junky). The trailing empty record is actually there in the
dataset I'm parsing. It's just an artefact of having a trailing \n at the end
of the dataset I'm multiplying to get the larger dataset (without it,
multiplying the original dataset by n gave n+1 records instead of 2n+1, as it
would concatenate the last and first records of each adjacent pair of
copies). I haven't noticed the _leading_ empty record in my tests, so I have
no idea what to do with it.
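A rough sketch of the trailing-\n effect, for anyone following along. The
two-record test set and the count_records helper here are made up for
illustration (not the actual test data or parser from this thread), and the
split is a naive one on "\n":

```python
# Hypothetical two-record test set; the real one from the thread is not shown.
testset = "a,b,c\nd,e,f\n"


def count_records(data):
    """Count records with a naive split on the record delimiter.

    A trailing "\n" produces one final empty record.
    """
    return len(data.split("\n"))


n = 5

# With the trailing newline: 2 records per copy, plus one trailing empty record.
with_newline = testset * n

# Without it: the last record of each copy runs into the first of the next.
without_newline = testset.rstrip("\n") * n

print(count_records(with_newline))     # 2n + 1
print(count_records(without_newline))  # n + 1
```

So the empty record at the end is the price of keeping the copies from
running into each other.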
Looks like C was the answer though :). Enjoy yourself,
Mike
-----Original Message-----
From: python-list-admin at python.org
[mailto:python-list-admin at python.org]On Behalf Of Darrell
Sent: November 11, 1999 6:18 PM
To: Mike Fletcher; Darrell; python-list at python.org
Subject: Re: ASCII delimited files
----- Original Message -----
From: Mike Fletcher <mcfletch at vrtelecom.com>
> Okay, here's an entirely different approach, extremely memory intensive,
> fairly fast.
I'm impressed if you just whipped this up today!
Should it trim the leading empty record ???
I noticed that it returns an additional empty record.
Here's the 'C' version with your tests.
Time for file of 365 bytes
(5 * testset, 10 records found):
0.0
Time for file of 3650 bytes
(50 * testset, 100 records found):
0.0
Time for file of 36500 bytes
(500 * testset, 1000 records found):
0.0160000324249
Time for file of 365000 bytes
(5000 * testset, 10000 records found):
0.15600001812
next test may take 15 seconds or so
Time for file of 3650000 bytes
(50000 * testset, 100000 records found):
1.89099991322
next test may take 45 seconds or so
Time for file of 10950000 bytes
(150000 * testset, 300000 records found):
7.32799994946
--Darrell