Pattern Matching

Greg Lindstrom greg.lindstrom at
Mon Jul 19 20:55:57 CEST 2004


I'm running Python 2.2.3 on Windows XP "Professional" and am reading a file
with 1 very long line of text (the line consists of multiple records with no
cr/lf).  What I would like to do is scan for the occurrence of a specific
pattern of characters which I expect to repeat many times in the file.
Suppose I want to search for "Start: mm/dd/yyyy" and capture the mm/dd/yyyy
data for processing each time I find it.  This is the type of problem I used
to solve with <duck>Perl<\duck> in a former lifetime using regular
expressions.  The following does not work, but is the flavor of what I want
to do:

long_line_of_text = ('Start: 1/1/2004 and some stuff.~Start: 2/3/2004 stuff.~'
                     'Start: 5/1/2004 morestuff.~')
while re.match('Start:\ (\D?/\D?/\D+)', long_line_of_text):
    # process the date string here which I hoped to catch in the parenthesis

I'd like this to keep matching and processing the string as long as it keeps
matching the pattern, bopping down the string as it goes.
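
For reference, a minimal sketch of the "keep matching down the string"
behaviour described above. It rests on a few assumptions: the sample string
below is hypothetical (every record carries the "Start:" prefix with a colon),
dates have a 1-2 digit month and day with a 4-digit year, and the names
date_pattern and dates are illustrative only. Note that re.match() only tests
the start of the string, and \D matches a non-digit, so the sketch uses
finditer() with \d instead.

```python
import re

# Hypothetical sample data: three tilde-terminated records on one line.
long_line_of_text = ('Start: 1/1/2004 and some stuff.~'
                     'Start: 2/3/2004 stuff.~'
                     'Start: 5/1/2004 morestuff.~')

# \d matches a digit (\D would match a NON-digit); {1,2} allows 1 or 2 digits.
date_pattern = re.compile(r'Start: (\d{1,2}/\d{1,2}/\d{4})')

# finditer() scans the whole string left to right and yields one match
# object per non-overlapping occurrence, so no manual loop index is needed.
dates = []
for match in date_pattern.finditer(long_line_of_text):
    dates.append(match.group(1))  # group(1) is the parenthesised date
```

If only the captured text is needed, date_pattern.findall(long_line_of_text)
should return the same list of date strings without the explicit loop.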

Another way to handle this is to replace all of the tildes with linefeeds
(tildes are the end of segment marker), or split the records on the tilde
and go from there.  I'd just like to know how I could do it with regular
expressions.
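
The split-on-tilde alternative mentioned above might look roughly like this.
It is only a sketch under assumptions: the sample string is hypothetical,
every record is assumed to begin with "Start:", and the names records and
record_dates are made up for illustration.

```python
import re

# Hypothetical sample data: three tilde-terminated records on one line.
long_line_of_text = ('Start: 1/1/2004 and some stuff.~'
                     'Start: 2/3/2004 stuff.~'
                     'Start: 5/1/2004 morestuff.~')

date_pattern = re.compile(r'Start: (\d{1,2}/\d{1,2}/\d{4})')

# split('~') leaves an empty string after the trailing tilde; drop it.
records = [r for r in long_line_of_text.split('~') if r]

record_dates = []
for record in records:
    # Each record starts with "Start:", so match() at position 0 is enough.
    m = date_pattern.match(record)
    if m:
        record_dates.append(m.group(1))
```

Splitting first keeps each record available for further processing alongside
its date, at the cost of building a list of all records in memory.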

Thanks for your help,

