Questions about regex

Jared.S.Bauer
Fri May 29 20:26:07 CEST 2009


I'm new to Python and I'm having problems with a regular expression. I
use TextMate as my editor, and when I run the regex in TextMate it
works fine, but when I run it as part of the script it freezes. Could
anyone help me figure out why this is happening and how to fix it?
Here is the script:

# regular expression search and replace
import sys, os, re, string, csv

#Open the file and taking its data
myfile=open('Steve_query3.csv') #Steve_query_test.csv
#create an error flag  to loop the script twice
#store all file's data in the string object 'text'
text = myfile.read()

for i in range(2):
	#def textParse(text, reRun):
	print 'how many times is this getting executed', i

	#Now to create the newfile 'test' and write our 'text'
	newfile = open('Steve_query3_out.csv', 'w')
	#open the new file and set it with 'w' for "write"
	#loop through 'text' clean them up and write them into the 'newfile'
			#sub(  	pattern, repl, string[, count])
			#"sub("(?i)b+", "x", "bbbb BBBB")" returns 'x x'.
	text = re.sub('(\<(/?[^\>]+)\>)', "", text)#remove the HTML
	text = re.sub('/<!--(.|\s)*?-->/', "", text) #remove comments  <!--[^
	text = re.sub('\/\*(.|\s)*?;}', "", text) #remove css formatting
	#remove a bunch of word formatting yuck
	text = re.sub("&nbsp;", " ", text)
	text = re.sub("&lt;", "<", text)
	text = re.sub("&gt;", ">", text)
	text = re.sub("&quot;|&rquot;|&ldquo;", "\'", text)
#The two following lines are the ones giving me the problems
	text = re.sub("w:(.|\s)*?\n", "", text)
	text = re.sub("UnhideWhenUsed=(.|\s)*?\n", "", text)
	text = re.sub(re.compile('^\r?\n?$', re.MULTILINE), '', text) #remove the extra whitespace
	#now write out the new file and close it
	newfile.write(text)
	newfile.close()

	#open the newfile and run the script again
	#Open the file and taking its data

	myfile=open('Steve_query3_out.csv') #Steve_query_test.csv
	#store all file's data in the string object 'text'
	text = myfile.read()

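One possible cause of the freeze, offered as a guess rather than a
confirmed diagnosis: in (.|\s)*? a space or tab can be matched by either
branch, so when no closing \n follows (for example on a last line with no
trailing newline), the engine may retry an exponential number of branch
combinations before giving up. Below is a minimal sketch of the two
problem substitutions without that alternation, assuming each match is
only meant to run to the end of the current line. The helper name
strip_word_markup and the sample string are made up for illustration.

```python
import re

# Assumed replacements for the two patterns marked as problematic.
# [^\n]* consumes everything up to the next newline without the
# overlapping (.|\s) alternation, so a failed match gives up after a
# single linear scan instead of trying many branch combinations.
W_LINE = re.compile(r"w:[^\n]*\n")
UNHIDE_LINE = re.compile(r"UnhideWhenUsed=[^\n]*\n")


def strip_word_markup(text):
    """Remove 'w:...' and 'UnhideWhenUsed=...' runs through end of line."""
    text = W_LINE.sub("", text)
    text = UNHIDE_LINE.sub("", text)
    return text


# Hypothetical sample input, not taken from the real CSV file.
sample = "keep this\nw:something here\nUnhideWhenUsed=false Name=x\nkeep that\n"
cleaned = strip_word_markup(sample)
```

On this sample, cleaned is 'keep this\nkeep that\n'. Input that ends in
"w:" followed by a long run of spaces and no newline is returned
unchanged rather than hanging.
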
Thanks for the help,

