Legacy data parsing
spam.csubich+block at block.subich.spam.com
Fri Jul 8 21:50:46 CEST 2005
> I've just started to learn programming and was told this was a good
> place to ask questions :)
> Where I work, we receive large quantities of data which is currently
> all printed on large, obsolete, dot matrix printers. This is a problem
> because the replacement parts will not be available for much longer.
> So I'm trying to create a program which will capture the fixed width
> text file data and convert as well as sort the data (there are several
> different report types) into a different format which would allow it to
> be printed normally, or viewed on a computer.
Are these reports all in the same page-wise format, with fixed-width
columns? If so, then the suggestion about a state machine sounds good
-- just run a state machine to figure out which line type you're on,
then extract the fixed-width fields from each line via slices, e.g.:
name = line[x:y]
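
To make that a bit more concrete, here is a rough sketch of the state
machine plus slicing idea. The report layout is entirely made up for
illustration: I'm assuming each report starts with a line beginning
"REPORT", followed by one line of column titles, then detail lines
until a blank line. The column offsets in FIELDS and the name
parse_report are likewise hypothetical, so adjust them to your real
files.

```python
# Hypothetical layout: these column offsets are assumptions, not taken
# from any real report. Each entry maps a field name to (start, end).
FIELDS = {
    "name": (0, 10),
    "qty": (10, 15),
    "price": (15, 23),
}


def parse_report(lines):
    """Walk the lines with a small state machine and slice detail rows."""
    state = "header"      # waiting for a "REPORT ..." line
    report_type = None
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if state == "header":
            if line.startswith("REPORT"):
                report_type = line[len("REPORT"):].strip()
                state = "columns"
        elif state == "columns":
            # The column-title line carries no data; skip it.
            state = "detail"
        elif state == "detail":
            if not line.strip():
                # A blank line ends this report's detail section.
                state = "header"
                continue
            record = {key: line[start:end].strip()
                      for key, (start, end) in FIELDS.items()}
            record["report"] = report_type
            records.append(record)
    return records


# Build sample lines with the same widths as FIELDS (10, 5 and 8 chars).
row = "%-10s%5s%8s"
sample = [
    "REPORT SALES",
    row % ("NAME", "QTY", "PRICE"),
    row % ("Widget", "12", "3.50"),
    row % ("Gadget", "7", "12.00"),
    "",
    "REPORT STOCK",
    row % ("NAME", "QTY", "PRICE"),
    row % ("Sprocket", "140", "0.75"),
]
records = parse_report(sample)
for record in records:
    print(record)
```

Keeping the offsets in one table means a second report type only needs
its own table and perhaps one more state, rather than new slicing code.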
If that doesn't work, then pyparsing or DParser might work for you as a
more general-purpose parser.