[Tutor] newbie text parsing question
Alan Trautman
ATrautman@perryjudds.com
Wed, 28 Aug 2002 11:41:16 -0500
Again, if you are looking for a concept of how to do this rather than the
code, here is the approach I would use:
read the file
strip all newlines ('\n')
put newlines ('\n') after every colon
save the new file
open the new file
read every other line, inserting a comma between each element
add a newline ('\n') at the end of a record
append this to your master file containing all the previously parsed items
repeat until all records are parsed
Hope that helps. It's not really clever or smart, and it doesn't take
advantage of any special features of Python, but it should work.
You will have to add the extra step of splitting the records apart (I'm
hoping for your sake they are the same length), and you can then repeat the
process every x lines in the original file.
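The steps above can be sketched in Python roughly as follows. This is only a
sketch under some assumptions, not a finished solution for the real file: it
assumes the field labels are known in advance (the LABELS list is copied from
the sample record in the question) and that every record starts with
"Case Number:". It also makes one change to the steps: splitting on every
colon would leave each value on the same line as the next label, so the sketch
splits on the known labels instead and then takes every other piece as a
value. The function names are made up for illustration.

```python
import csv
import io
import re

# ASSUMPTION: these labels are copied from the one sample record in the
# question; a real file may contain others.
LABELS = [
    "Case Number", "Recall Notification Report", "Date Opened", "Date Closed",
    "Recall Class", "Press Release (Y/N)", "Domestic Est. Number", "Name",
    "Imported Product (Y/N)", "Foreign Estab. Number", "City", "State",
    "Country", "Product", "Problem", "Description",
    "Total Pounds Recalled", "Pounds Recovered",
]
# ASSUMPTION: every record begins with this label.
FIRST_LABEL = LABELS[0]

# One pattern matching any known label followed by a colon. Longer labels
# are listed first so they are preferred over shorter ones.
LABEL_RE = re.compile(
    "(" + "|".join(re.escape(l) for l in sorted(LABELS, key=len, reverse=True)) + "):"
)


def parse_records(text):
    """Return a list of records, each a list of values in file order."""
    parts = LABEL_RE.split(text)
    records = []
    # parts[0] is whatever precedes the first label; after that the list
    # alternates label, value, label, value, ...
    for label, value in zip(parts[1::2], parts[2::2]):
        if label == FIRST_LABEL or not records:
            records.append([])  # start a new record
        # Collapse newlines and runs of spaces inside the value.
        records[-1].append(" ".join(value.split()))
    return records


def to_csv(records):
    """Format records as CSV text, quoting values that contain commas."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(records)
    return out.getvalue()


SAMPLE = """\
Case Number: 076-2000 Recall Notification Report: RNR076-2000
Date Opened: 12/20/2000 Date Closed: 04/20/2001
Recall Class: 1 Press Release (Y/N): Y
Domestic Est. Number: 02040 M Name: Harper's Country Ham
Imported Product (Y/N): Y Foreign Estab. Number: N/A
City: Clinton State: KY Country: USA
Product: Country Ham
Problem: BACTERIA Description: LISTERIA
Total Pounds Recalled: 10,400 Pounds Recovered: 7,561
"""

if __name__ == "__main__":
    print(to_csv(parse_records(SAMPLE)), end="")
```

The csv module is used for the output because values such as 10,400 contain
commas themselves; the writer puts quotes around those so a spreadsheet reads
them as one field.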
Good luck
Alan
-----Original Message-----
From: Rob [mailto:rob@uselesspython.com]
Sent: Wednesday, August 28, 2002 11:32 AM
To: Python Tutor
Subject: RE: [Tutor] newbie text parsing question
There are different ways to get to the solution you're after. Do you want to
code for a situation in which you know you will always expect the same
format to the file, or do you want to account for files that don't have
precisely the same format?
For instance, will you always have only one "Problem:" listed?
Do you already have the grasp of reading and writing files to your
satisfaction? The Tutorial that tends to ship with Python distributions
(and also easily found at python.org) has a section demonstrating File I/O.
There are also lots of samples out there, at sites like the Vaults of
Parnassus, the Python Cookbook site, and Useless Python.
Rob
-----Original Message-----
From: tutor-admin@python.org [mailto:tutor-admin@python.org]On Behalf Of Ron
Nixon
Sent: Wednesday, August 28, 2002 10:57 AM
To: tutor@python.org
Subject: [Tutor] newbie text parsing question
I've got a file that looks like this:
Case Number: 076-2000 Recall Notification Report: RNR076-2000
Date Opened: 12/20/2000 Date Closed: 04/20/2001
Recall Class: 1 Press Release (Y/N): Y
Domestic Est. Number: 02040 M Name: Harper's Country Ham
Imported Product (Y/N): Y Foreign Estab. Number: N/A
City: Clinton State: KY Country: USA
Product: Country Ham
Problem: BACTERIA Description: LISTERIA
Total Pounds Recalled: 10,400 Pounds Recovered: 7,561
I'd like to be able to read in all of the file and extract the data following
the Title and ":" to produce something like this:
076-2000, RNR076-2000, 12/20/2000, 04/20/2001, 1, Y, 02040 M, Harper's Country
Ham, etc.
that I can then import into a spreadsheet or database. I found nothing at
the Python.org site nor in the Text Processing using Python book. Any ideas?
thanks in advance
Ron