[Tutor] trouble with re
Ertl, John
john.ertl at fnmoc.navy.mil
Mon May 8 19:40:29 CEST 2006
I have a file with more than 10,000 lines, and each line holds a
comma-delimited string.
The file should look like:
DFRE,ship name,1234567
FGDE,ship 2,
,sdfsf
The ,sdfsf line is bad data.
Some of the lines are messed up. I want to find all lines that do not end
in a comma or seven digits and do some work on them. I can search for just
the trailing seven digits, but I cannot match either the seven digits or the
comma at the end of the line in a single search.
Any ideas?
import re
import sys
import os

# This is the line that I cannot get correct. I am trying to find lines
# that end in a comma or 7 digits.
p = re.compile('\d{7}$ | [,]$')
newFile = open("newFile.txt",'w')
oldFile = open("shipData.txt",'r')
for line in oldFile:
    if p.search(line):
        newFile.write(line)
    else:
        newFile.write("*BAD DATA " + line)
newFile.close()
oldFile.close()
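
One likely cause, offered as a sketch rather than a definitive answer: without the re.VERBOSE flag, the spaces around "|" in the pattern above are literal characters, so the first alternative needs a space after the end of the line and the second needs a space before the comma. The version below drops the spaces and groups the two alternatives before a single "$". The names END_OK and is_good_line are hypothetical, chosen only for illustration, and the sample lines are the ones quoted earlier in this message.

```python
import re

# Assumed fix: no spaces around "|", both alternatives anchored by one "$".
# A raw string keeps "\d" from being treated as a string escape.
END_OK = re.compile(r'(?:\d{7}|,)$')

def is_good_line(line):
    """Return True if the line ends in a comma or in seven digits."""
    # Strip the trailing newline so "$" is tested against the real last character.
    return END_OK.search(line.rstrip('\n')) is not None

# The three sample lines from the message, as they would be read from a file.
sample = ["DFRE,ship name,1234567\n", "FGDE,ship 2,\n", ",sdfsf\n"]
flags = [is_good_line(s) for s in sample]
```

Note that this pattern also accepts a line ending in more than seven digits, since it only checks the last seven characters. If exactly seven digits after a comma are required, r'(?:,\d{7}|,)$' might be closer to the intent.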