[Tutor] trouble with re

Kent Johnson kent37 at tds.net
Mon May 8 19:53:24 CEST 2006


Ertl, John wrote:
> I have a file with 10,000+ lines and it has a comma-delimited string on each
> line.
> 
> The file should look like:
> 
> DFRE,ship name,1234567
> FGDE,ship 2,
> ,sdfsf
> 
> The ,sdfsf  line is bad data
> 
> p = re.compile('\d{7}$ | [,]$')   # this is the line that I cannot get
> correct. I am trying to find lines that end in a comma or 7 digits.

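Spaces are significant in regular expressions unless you compile them 
with the re.VERBOSE flag, so your pattern only matches where those 
literal spaces appear. Also, you don't need a character class for a 
single character. Try
p = re.compile(r'\d{7}$|,$')
or maybe
p = re.compile(r'(\d{7}|,)$')
(The r prefix makes the pattern a raw string, so the backslash reaches 
the regex engine untouched.)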
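
To see the effect of the spaces, here is a small sketch that runs the original pattern, the version without spaces, and the original compiled with re.VERBOSE against the three sample lines from the question. The names spaced, fixed, verbose, samples and matches are just for illustration, and search() is assumed because the pattern has to match at the end of the line rather than the start.

```python
import re

# The original pattern: the spaces around | are part of the pattern.
spaced = re.compile(r'\d{7}$ | [,]$')
# The same pattern with the spaces (and the character class) removed.
fixed = re.compile(r'\d{7}$|,$')
# With re.VERBOSE, whitespace outside a character class is ignored.
verbose = re.compile(r'\d{7}$ | [,]$', re.VERBOSE)

# Sample lines taken from the question.
samples = ['DFRE,ship name,1234567', 'FGDE,ship 2,', ',sdfsf']

def matches(pattern):
    """Return one True/False per sample line, using search()."""
    return [bool(pattern.search(line)) for line in samples]

spaced_hits = matches(spaced)
fixed_hits = matches(fixed)
verbose_hits = matches(verbose)

print(spaced_hits)
print(fixed_hits)
print(verbose_hits)
```

The spaced pattern finds nothing, because it asks for a space after the end of the line or a space directly before the final comma.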

Actually, since the seven digits are preceded by a comma, you could just 
make the digits optional:
p = re.compile(r',(\d{7})?$')
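
As a rough sketch of how that last pattern could be used to separate good lines from bad ones (the list called lines stands in for the contents of your file, and good and bad are names made up for this example):

```python
import re

# A comma, optionally followed by exactly seven digits, at the end of the line.
p = re.compile(r',(\d{7})?$')

# Stand-in for the lines of the real file.
lines = ['DFRE,ship name,1234567', 'FGDE,ship 2,', ',sdfsf']

# search() scans the whole line; match() would only try at the start.
good = [line for line in lines if p.search(line)]
bad = [line for line in lines if not p.search(line)]

print(good)
print(bad)
```

Note that $ also matches just before a trailing newline, so lines read straight from a file should work without stripping. This pattern is also a little stricter than the first two: a line ending in a comma and eight digits is rejected here, while \d{7}$ would accept it because it only looks at the last seven characters.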

Kent


