[Tutor] Help with re.sub()

John Clark clajo04 at mac.com
Fri Mar 17 05:20:33 CET 2006


Hi,

I have a file that is a long list of records (roughly) in the format

objid@objdata

So, for example:

id1@data1
id1@data2
id1@data3
id1@data4
id2@data1
....

What I would like to do is run a regular expression against this and
wind up with:

id1@data1@data2@data3@data4
id2@data1

So I ran the following regex against the string:

re.compile(r'([^@]*)@(.*)\n\1@(.*)').sub(r'\1\2\3', string)

and I wound up with:

id1@data1@data2
id1@data3@data4
id2@data1

So, my questions are:
(1) Is there any way to get a single regular expression to handle
overlapping matches so that I get what I want in one call?
(2) Is there any way (without comparing the before and after strings) to
know if a re.sub(...) call did anything?
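
For question (2), one option that may fit is subn(), which behaves like sub() but also returns how many substitutions were made. The sketch below is only an illustration: the sample text is assumed, and the pattern is a variant of mine (anchored with ^ and re.MULTILINE, with the "@" separators written out in the replacement), not the exact expression from this post.

```python
import re

# Assumed sample data in the objid@objdata format described above.
text = "id1@data1\nid1@data2\nid2@data1\n"

# Variant pattern (an assumption): anchored to line starts, and the id
# group is not allowed to cross a newline.
pattern = re.compile(r'^([^@\n]*)@(.*)\n\1@(.*)$', re.MULTILINE)

# subn() returns a tuple: (new_string, number_of_substitutions_made).
new_text, count = pattern.subn(r'\1@\2@\3', text)

if count:
    print("changed", count, "record pair(s)")
else:
    print("nothing matched")
```

A count of 0 means the string came back unchanged, so no before/after comparison is needed.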

I suppose I could do something like:

pattern = re.compile(r'([^@]*)@(.*)\n\1@(.*)')

while pattern.search(string):
    string = pattern.sub(r'\1\2\3', string)

but I would like to avoid the explicit loop if possible...
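
One way the loop might be avoided, sketched under assumptions: match a whole run of consecutive lines that share an id in a single match, and pass sub() a function that collapses the run. The names run, join_run and sample are hypothetical, and the pattern assumes that records with the same id are adjacent and that ids contain no "@".

```python
import re

# One match covers the first line of a run plus every following line
# that starts with the same id (at least one such line is required).
run = re.compile(r'^([^@\n]*)@(.*(?:\n\1@.*)+)$', re.MULTILINE)

def join_run(match):
    """Collapse a run of same-id lines into a single id@d1@d2... line."""
    obj_id = match.group(1)
    # Drop the repeated "\n<id>@" prefix of each continuation line.
    joined = match.group(2).replace("\n" + obj_id + "@", "@")
    return obj_id + "@" + joined

# Assumed sample data matching the example in this post.
sample = "id1@data1\nid1@data2\nid1@data3\nid1@data4\nid2@data1\n"
result = run.sub(join_run, sample)
print(result)
```

Because each run is consumed by one match, the overlapping-match problem does not arise and a single sub() call is enough.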

Actually, should the loop above even work?  If I execute it in my
debugger, my string gets really funky, as if the re is losing track
of what the groups are, and I end up with a single really long
string rather than what I expect.
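
Two things in the expression as written could plausibly explain the funky result: the replacement r'\1\2\3' contains no "@", so every pass drops the separators, and [^@]* is allowed to match newlines, so group 1 can swallow several lines at once. A hedged sketch of a loop that avoids both follows; the function name merge_records and the sample data are assumptions for illustration.

```python
import re

def merge_records(text):
    """Repeatedly merge adjacent lines that share an id until none remain."""
    # [^@\n]* keeps the id on one line; ^ and $ with MULTILINE pin the
    # match to whole lines.
    pattern = re.compile(r'^([^@\n]*)@(.*)\n\1@(.*)$', re.MULTILINE)
    while True:
        # The replacement puts the "@" separators back between the parts.
        text, count = pattern.subn(r'\1@\2@\3', text)
        if count == 0:
            return text

# Assumed sample data matching the example in this post.
sample = "id1@data1\nid1@data2\nid1@data3\nid1@data4\nid2@data1\n"
merged = merge_records(sample)
print(merged)
```

Each pass merges pairs of adjacent lines, so a run of N lines with the same id should settle after roughly log2(N) merging passes plus one final pass that finds nothing to do.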


Any help on this would be appreciated.

-jdc





