Regex Generator From Multiple Files

MRAB google at mrabarnett.plus.com
Mon Jan 5 19:48:20 EST 2009


James Pruitt wrote:
> Given a number of files, say 3, that represent technical support tickets 
> in the same format, I am looking for a way to generate regular 
> expressions for the different fields automatically.
> 
> An example of one line from each file:
> Date: 12/30/2008 Room: 457 Building: Main
> Date: 12/31/2008 Room: A21 Building: Annex
> Date: 1/4/2009 Room: L69 Building: Library
> 
> The program would then, possibly using the python diff library, generate 
> the regular expression needed to parse out different fields. In this 
> case it might return a tuple like 
> ("^Date:[\w]+(.*)[\w]+Room","Room:[\w]+(.*)[\w]+Building","Building:[\w]+(.*)[\w]+$") 
> that would match each of the fields based on the common text, and sort of 
> assume that what changes between them is the data we are looking for.
> 
Why not just assume that each field consists of a word terminated by a 
colon, then some text, then the next field or the end of the line?
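
A minimal sketch of that idea, assuming every field name is a single word 
(\w+) followed by a colon, and that values are non-empty and never contain a 
"word:" sequence themselves (a time such as "at 12:30" would be misread as a 
new field). The names FIELD_RE and parse_fields are made up for illustration:

```python
import re

# Hypothetical pattern: a one-word field name, a colon, then text matched
# lazily up to the next "name:" or the end of the line.
FIELD_RE = re.compile(r"(\w+):\s*(.*?)(?=\s+\w+:|\s*$)")

def parse_fields(line):
    """Return a list of (name, value) pairs found in one ticket line."""
    return FIELD_RE.findall(line)

# The three sample lines from the original question.
samples = [
    "Date: 12/30/2008 Room: 457 Building: Main",
    "Date: 12/31/2008 Room: A21 Building: Annex",
    "Date: 1/4/2009 Room: L69 Building: Library",
]

for line in samples:
    # dict() turns the pairs into a name -> value mapping.
    print(dict(parse_fields(line)))
```

No diffing between files is needed with this approach, since the field names 
are discovered from each line as it is parsed.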


