[TriZPUG] More Fun With Text Processing

Josh Johnson jj at email.unc.edu
Fri Apr 3 17:31:57 CEST 2009


Ok all,
Since we've got a brain trust of pythonistas who know how to deal with
strings, here's a problem I'm facing right now that I'd like some input on:

I've got a tabular listing, the output of a command-line program, that
I need to parse into some sort of structure.

Here's an example of the data (the headings and column widths will vary):
TARGET         VOLUME GROUP        LENGTH     AVAILABLE         NPE  MIRROR
1.1               HIGHAVAIL    5001.023GB    4501.008GB     1192337  2.1
1.3                  BACKUP    5001.023GB    4250.759GB     1192337
1.4                  BACKUP    3000.613GB    3000.353GB      715402
2.2               HIGHAVAIL    5001.023GB    5001.015GB     1192337  1.2
2.3                  BACKUP    5001.023GB    5000.763GB     1192337
2.4                  BACKUP    3000.613GB    3000.353GB      715402

I'd like a structure I can work with, say, a list of dicts (hashes), one per row.

My initial approach involves treating the header row as the guide for 
the field lengths, and then extracting substrings for each field in each 
row.
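
A minimal sketch of that header-guided idea, in Python. In the sample above
the middle columns look right-aligned, so each value ends where its heading
ends; the sketch therefore cuts every row at the heading end offsets rather
than the heading starts. It assumes headings are separated by at least two
spaces, that a heading may contain single spaces (as in VOLUME GROUP), and
that the last column runs to the end of the line. The names SAMPLE and
parse_table are made up for illustration.

```python
import re

# First few lines of the sample output, copied verbatim.
SAMPLE = """\
TARGET         VOLUME GROUP        LENGTH     AVAILABLE         NPE  MIRROR
1.1               HIGHAVAIL    5001.023GB    4501.008GB     1192337  2.1
1.3                  BACKUP    5001.023GB    4250.759GB     1192337
1.4                  BACKUP    3000.613GB    3000.353GB      715402
"""


def parse_table(text):
    """Return a list of dicts, one per data row, keyed by heading."""
    lines = [line for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    # Assumption: a heading is one or more words joined by single spaces,
    # and headings are separated from each other by two or more spaces.
    columns = [(m.group(), m.end())
               for m in re.finditer(r"\S+(?: \S+)*", header)]

    records = []
    for row in rows:
        record = {}
        start = 0
        for i, (name, end) in enumerate(columns):
            # Assumption: the last column runs to the end of the line.
            stop = None if i == len(columns) - 1 else end
            record[name] = row[start:stop].strip()
            start = end
        records.append(record)
    return records


records = parse_table(SAMPLE)
for record in records:
    print(record)
```

If a value ever extends past the end of its heading, these offsets will cut
it in the wrong place, so the alignment assumption is worth checking against
real output before relying on it.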

I also thought about just doing a split on spaces, but some of the 
fields could have spaces in their data.
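
A quick illustration of why a plain split falls short, using two lines from
the sample. This is only a demonstration of the mismatch, not a proposed
parser; the variable names are made up for the example.

```python
header = "TARGET         VOLUME GROUP        LENGTH     AVAILABLE         NPE  MIRROR"
row = "1.3                  BACKUP    5001.023GB    4250.759GB     1192337"

header_fields = header.split()
row_fields = row.split()

# "VOLUME GROUP" becomes two tokens, and the empty MIRROR value
# leaves no token at all, so the counts no longer agree.
print(len(header_fields), header_fields)
print(len(row_fields), row_fields)

# Pairing the tokens up anyway puts values under the wrong headings.
mismatched = dict(zip(header_fields, row_fields))
print(mismatched)
```

Here the header yields seven tokens and the row only five, so every value
after the first ends up shifted relative to its heading.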

What do you guys think?

JJ

