text file reformatting
Tim Chase
python.list at tim.thechases.com
Sun Oct 31 15:48:09 EDT 2010
> PRJ01001 4 00100END
> PRJ01002 3 00110END
>
> I would like to pick only some columns to a new file and put them to a
> certain places (to match previous data) - definition file (def.csv)
> could be something like this:
>
> VARIABLE FIELDSTARTS FIELD SIZE NEW PLACE IN NEW DATA FILE
> ProjID ; 1 ; 5 ; 1
> CaseID ; 6 ; 3 ; 10
> UselessV ; 10 ; 1 ;
> Zipcode ; 12 ; 5 ; 15
>
> So the new datafile should look like this:
>
> PRJ01 001 00100END
> PRJ01 002 00110END
How flexible is the def.csv format? The difficulty I see with
your def.csv format is that it leaves undefined gaps (presumably
to be filled in with spaces) and that it also has a blank "new
place in new file" value. If you could instead specify the
width to which you want each field padded, omit the variables you
don't want in the output, and list the variables in the same
order you want them in the output:
Variable; Start; Size; Width
ProjID; 1; 5; 10
CaseID; 6; 3; 10
Zipcode; 12; 5; 5
End; 17; 3; 3
(note that I lazily use the same method to copy the END from the
source to the destination, rather than coding specially for it)
you could do something like this (untested):
import csv

f = open('def.csv')
next(f)  # discard the header row
r = csv.reader(f, delimiter=';')
fields = [
    # def.csv columns are 1-based, Python slices are 0-based;
    # width must be an int for ljust()
    (varname, slice(int(start) - 1, int(start) - 1 + int(size)), int(width))
    for varname, start, size, width
    in r
    ]
f.close()

out = open('out.txt', 'w')
try:
    for row in open('data.txt'):
        # copy each selected field, padded with spaces to its width
        for varname, slc, width in fields:
            out.write(row[slc].ljust(width))
        out.write('\n')
finally:
    out.close()
Hope that's fairly easy to follow and makes sense. There might
be some fence-posting errors (particularly your use of "1" as the
initial offset, while Python uses "0" as the initial offset for
strings).
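
To make the fence-post issue concrete, here is a minimal sketch of
the offset conversion. The helper name to_slice is made up for
illustration, and the sample line assumes the fields are separated
by single spaces, as the data appears in the quoted message.

```python
def to_slice(start, size):
    """Turn a 1-based start column and a field size into a 0-based slice."""
    begin = int(start) - 1
    return slice(begin, begin + int(size))

line = "PRJ01001 4 00100END"

proj_id = line[to_slice(1, 5)]    # "PRJ01"
case_id = line[to_slice(6, 3)]    # "001"
zipcode = line[to_slice(12, 5)]   # "00100"

# Using the 1-based start directly as a Python index is off by one:
naive_zipcode = line[12:12 + 5]   # "0100E"
```

Subtracting one from the start is the only change needed; the size
stays the same.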
If you can't modify the def.csv format, then things are a bit
more complex and I'd almost be tempted to write a script to try
and convert your existing def.csv format into something simpler
to process like what I describe.
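
A rough sketch of what such a conversion script might look like
(Python 3, untested against real files). The function name
convert_def is hypothetical, and it assumes that each width is the
distance to the next field's "new place", that the last field
keeps its own size, and that the header row has already been
skipped.

```python
import csv
import io

def convert_def(rows):
    """Turn (name, start, size, new_place) rows into
    (name, start, size, width) rows.

    Rows with a blank new_place are dropped. Each width is the
    distance to the next field's new_place; the last field keeps
    its own size as its width.
    """
    kept = [
        (name.strip(), int(start), int(size), int(place))
        for name, start, size, place in rows
        if place.strip()
    ]
    kept.sort(key=lambda field: field[3])  # order by output position
    converted = []
    for i, (name, start, size, place) in enumerate(kept):
        if i + 1 < len(kept):
            width = kept[i + 1][3] - place
        else:
            width = size
        converted.append((name, start, size, width))
    return converted

# The definition rows from the original message, without the header.
old_def = io.StringIO(
    "ProjID ; 1 ; 5 ; 1\n"
    "CaseID ; 6 ; 3 ; 10\n"
    "UselessV ; 10 ; 1 ;\n"
    "Zipcode ; 12 ; 5 ; 15\n"
)
new_def = convert_def(csv.reader(old_def, delimiter=';'))
for field in new_def:
    print('; '.join(str(value) for value in field))
```

For the sample definition this gives widths of 9, 5 and 5. It does
not check for overlapping fields (a width smaller than the size),
and a row for END would still have to be added by hand, since the
original def.csv doesn't list it.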
-tkc