ASCII delimited files

Mike Fletcher mcfletch at vrtelecom.com
Fri Nov 12 06:23:00 CET 1999


I never use CSV or TSV files myself (well, not more than once or twice a
year); I was just approaching the problem as one of optimisation (I'm an
optimisation junkie).  The trailing empty record is actually there in the
dataset I'm parsing.  It's just an artefact of the trailing \n at the end of
the dataset I'm multiplying to get the larger dataset.  Without it, the
original dataset x multiplied as x*n would give n+1 records instead of 2n+1,
because the last record of each copy would be concatenated with the first
record of the next.  I haven't noticed the _leading_ empty record in my
tests, so I have no idea what to do about it.
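
To make the record counts concrete, here is a minimal sketch.  It assumes a
hypothetical two-record test set (the real test data isn't shown in this
thread, but the 2n+1 count suggests two records per copy) and a naive split
on \n rather than the parser under discussion:

```python
# Hypothetical two-record test set; stands in for the real data, which
# is not shown in this thread.
testset = "a,b,c\nd,e,f\n"

def count_records(data):
    # Naive split on the record separator. A trailing "\n" produces a
    # final empty record, which is counted here.
    return len(data.split("\n"))

n = 5

# With the trailing "\n": 2 records per copy plus one trailing empty record.
with_newline = count_records(testset * n)

# Without it: the last record of each copy fuses with the first record
# of the next copy, leaving n + 1 records.
without_newline = count_records(testset.rstrip("\n") * n)

print(with_newline, without_newline)
```

With n = 5 this gives 11 records (2n+1) with the trailing \n and 6 (n+1)
without it.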

Looks like C was the answer though :).  Enjoy yourself,
Mike

-----Original Message-----
From: python-list-admin at python.org
[mailto:python-list-admin at python.org] On Behalf Of Darrell
Sent: November 11, 1999 6:18 PM
To: Mike Fletcher; Darrell; python-list at python.org
Subject: Re: ASCII delimited files


----- Original Message -----
From: Mike Fletcher <mcfletch at vrtelecom.com>
> Okay, here's an entirely different approach, extremely memory intensive,
> fairly fast.

I'm impressed if you just whipped this up today!
Should it trim the leading empty record?
I noticed that it also returns an additional empty record.

Here's the 'C' version with your tests.

Time for file of 365 bytes
 (5 * testset, 10 records found):
 0.0
Time for file of 3650 bytes
 (50 * testset, 100 records found):
 0.0
Time for file of 36500 bytes
 (500 * testset, 1000 records found):
 0.0160000324249
Time for file of 365000 bytes
 (5000 * testset, 10000 records found):
 0.15600001812
next test may take 15 seconds or so
Time for file of 3650000 bytes
 (50000 * testset, 100000 records found):
 1.89099991322
next test may take 45 seconds or so
Time for file of 10950000 bytes
 (150000 * testset, 300000 records found):
 7.32799994946

--Darrell


--
http://www.python.org/mailman/listinfo/python-list




