Working with Huge Text Files

Lorn Davies efoda at hotmail.com
Fri Mar 18 19:01:51 EST 2005


Hi there, I'm a Python newbie hoping for some direction in working with
text files that range from 100MB to 1GB in size. Basically, the rows for
each value of the first (primary) field, which the files are sorted by
(and maybe by the second field, a date, as well), need to be copied out
and written to their own file, and some string manipulations need to
happen along the way. An example of the current format:

XYZ,04JAN1993,9:30:27,28.87,7600,40,0,Z,N
XYZ,04JAN1993,9:30:28,28.87,1600,40,0,Z,N
 |
 | followed by like a million rows similar to the above, with
 | incrementing date and time, and then on to next primary field
 |
ABC,04JAN1993,9:30:27,28.875,7600,40,0,Z,N
 |
 | etc., there are usually 10-20 of the first field per file
 | so there's a lot of repetition going on
 |

The export would ideally look like this, where the first field would
become the name of the file (XYZ.txt):

19930104, 93027, 2887, 7600, 40, 0, Z, N
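
In other words, each row needs a few field-level changes before being
written out. Here's my rough idea of those changes, assuming plain string
slicing and replacing is a sensible way to do it (the month table is
something I'd have to hard-code):

MONTHS = {'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
          'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
          'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'}

fields = 'XYZ,04JAN1993,9:30:27,28.87,7600,40,0,Z,N'.split(',')
date = fields[1]                                     # '04JAN1993'
new_date = date[5:] + MONTHS[date[2:5]] + date[:2]   # '19930104'
new_time = fields[2].replace(':', '')                # '93027'
new_price = fields[3].replace('.', '')               # '2887'
print(', '.join([new_date, new_time, new_price] + fields[4:]))
# prints: 19930104, 93027, 2887, 7600, 40, 0, Z, N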

Pretty ambitious for a newbie? I really hope not. I've been looking at
SimpleParse, but it's a bit intense at first glance... I'm not sure where
to start, or even whether I need to go that route at all. Any help on
which direction to go or how to approach this would be hugely
appreciated.
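
For what it's worth, my rough (completely untested) idea for the splitting
part is below: read the big file one line at a time, keep a dictionary of
open output files keyed on the first field, and write each reworked row to
the right one ('ticks.csv' here is just a made-up input name). I don't
know whether this is sane for files in the 100MB-1GB range, or whether I
really do need something like SimpleParse instead.

MONTHS = {'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
          'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
          'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'}

def reformat(fields):
    # turn the date/time/price around as shown above and rejoin the row
    date = fields[1]
    return ', '.join([date[5:] + MONTHS[date[2:5]] + date[:2],
                      fields[2].replace(':', ''),
                      fields[3].replace('.', '')] + fields[4:])

outputs = {}                        # first field -> open output file
for line in open('ticks.csv'):      # iterates line by line, not all at once
    fields = line.strip().split(',')
    symbol = fields[0]              # e.g. 'XYZ' -> written to XYZ.txt
    if symbol not in outputs:
        outputs[symbol] = open(symbol + '.txt', 'w')
    outputs[symbol].write(reformat(fields) + '\n')

for f in outputs.values():
    f.close()

Since there are only 10-20 distinct values of the first field per file,
keeping all the output files open at once seems like it should be
manageable.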

Best regards,
Lorn



